[jira] [Created] (SOLR-5613) Upgrade Apache Commons Codec to version 1.9 in order to improve performance of BeiderMorseFilter
Thomas Champagne created SOLR-5613: -- Summary: Upgrade Apache Commons Codec to version 1.9 in order to improve performance of BeiderMorseFilter Key: SOLR-5613 URL: https://issues.apache.org/jira/browse/SOLR-5613 Project: Solr Issue Type: Improvement Components: Rules, Schema and Analysis, search Affects Versions: 4.6, 4.5.1, 4.5, 4.4, 4.3.1, 4.3, 4.2.1, 4.2, 4.1, 4.0, 3.6.2, 3.6.1, 3.6 Reporter: Thomas Champagne In version 1.9 of the commons-codec project, there are a lot of optimizations in the Beider Morse encoder. This is used by the BeiderMorseFilter in Solr. Do you think it is possible to upgrade this dependency? -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
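The BeiderMorseFilter delegates its phonetic encoding to commons-codec's BeiderMorseEncoder, so the optimizations in codec 1.9 land directly on the filter's hot path. A minimal sketch of that call path, using only the stock commons-codec API (the sample surname is arbitrary):

{code}
import org.apache.commons.codec.EncoderException;
import org.apache.commons.codec.language.bm.BeiderMorseEncoder;

public class BeiderMorseDemo {
  public static void main(String[] args) throws EncoderException {
    // This is the encoder BeiderMorseFilter wraps; upgrading commons-codec
    // speeds up exactly this encode() call.
    BeiderMorseEncoder encoder = new BeiderMorseEncoder();
    // Returns a pipe-delimited list of phonetic renderings of the input.
    System.out.println(encoder.encode("Champagne"));
  }
}
{code}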
[jira] [Commented] (SOLR-5288) Delta import is calling applyTranformer() during deltaQuerry and causing ScriptException
[ https://issues.apache.org/jira/browse/SOLR-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864057#comment-13864057 ] Daniele Baldi commented on SOLR-5288: - Hi, I found this error while experimenting delta import using TemplateTransformer: WARN : TemplateTransformer : Unable to resolve variable: variableName while parsing expression: ${variableName} This error is thrown because SOLR try to apply transformers on deltaQuery, too. I also think transformation is not required for deltaQuery. Thanks Daniele Delta import is calling applyTranformer() during deltaQuerry and causing ScriptException Key: SOLR-5288 URL: https://issues.apache.org/jira/browse/SOLR-5288 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.4 Reporter: Balaji Manoharan Priority: Critical While experimenting delta import, was getting Script Exception such as 'toString()' is not found on null. These are the queries that am using a) Query SELECT PK_FIELD, JOIN_DATE, USER_NAME FROM USERS b) Delta Query SELECY PK_FIELD FROM USERS WHERE LAST_MODIFIED_DATE '${dih.last_index_time}' c) Delta Import Query SELECT PK_FIELD, JOIN_DATE, USER_NAME FROM USERS WHERE PK_FIELD = '${dih.delta.PK_FIELD}' Have a script transformer as below function dynamicData(){ var joinDt = row.get('JOIN_DATE'); var dtDisplay = joinDt.toString(); //e.g to show that am not doing null check since join_date is a not null field ... ... return row; } entity name=user transformer=script:dynamicData .. ... /entity Problem: While performing delta import, was getting exception from Rhino engine on the script line 'joinDt.toString()'. The exception trace is as follows Caused by: javax.script.ScriptException: sun.org.mozilla.javascript.internal.EcmaError: TypeError: Cannot call method t oString of null (Unknown source#4) in Unknown source at line number 4 at com.sun.script.javascript.RhinoScriptEngine.invoke(RhinoScriptEngine.java:300) at com.sun.script.javascript.RhinoScriptEngine.invokeFunction(RhinoScriptEngine.java:258) at org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:56) ... 8 more Root Cause: Since I know join_date can not be null, have explored the solr source code and noticed that applyTransformer() is called during deltaQuery and at that time join_date will not be available. Reference: EntityProcessorWrapper.nextModifiedRowKey() I think transformation is not required for deltaQuery since it is mainly designed to retrieving the primary keys of the modified rows. Further, the output of deltaQuery will be used only in another SQL. Work around: Just added a null check as a workaround as below function dynamicData(){ var joinDt = row.get('JOIN_DATE'); if(joinDt == null){ return row; } ... ... return row; } I don't have too much knowledge about Solr and hence my suggestion could be invalid while looking from main use cases. Please validate my comments once. Thanks Balaji -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica
[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864141#comment-13864141 ] Markus Jelsma commented on SOLR-4260: - Ok, I followed all the great work here and in related tickets, and yesterday I had the time to rebuild Solr and check for this issue. I hadn't seen it yesterday but it is right in front of me again, using a fresh build from January 6th. Leader has Num Docs: 379659 Replica has Num Docs: 379661 Inconsistent numDocs between leader and replica --- Key: SOLR-4260 URL: https://issues.apache.org/jira/browse/SOLR-4260 Project: Solr Issue Type: Bug Components: SolrCloud Environment: 5.0.0.2013.01.04.15.31.51 Reporter: Markus Jelsma Assignee: Mark Miller Priority: Critical Fix For: 5.0, 4.7 Attachments: 192.168.20.102-replica1.png, 192.168.20.104-replica2.png, clusterstate.png After wiping all cores and reindexing some 3.3 million docs from Nutch using CloudSolrServer we see inconsistencies between the leader and replica for some shards. Each core holds about 3.3k documents. For some reason 5 out of 10 shards have a small deviation in the number of documents. The leader and slave deviate by roughly 10-20 documents, not more. Results hopping ranks in the result set for identical queries got my attention; there were small IDF differences for exactly the same record, causing a record to shift positions in the result set. During those tests no records were indexed. Consecutive catch-all queries also return a different numDocs. We're running a 10-node test cluster with 10 shards and a replication factor of two, and frequently reindex using a fresh build from trunk. I've not seen this issue for quite some time until a few days ago. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864147#comment-13864147 ] Markus Jelsma commented on SOLR-5379: - How does this patch handle boosts? Are the synonym and the original keywords boosted equally? Query-time multi-word synonym expansion --- Key: SOLR-5379 URL: https://issues.apache.org/jira/browse/SOLR-5379 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Tien Nguyen Manh Labels: multi-word, queryparser, synonym Fix For: 4.7 Attachments: quoted.patch, synonym-expander.patch While dealing with synonym at query time, solr failed to work with multi-word synonyms due to some reasons: - First the lucene queryparser tokenizes user query by space so it split multi-word term into two terms before feeding to synonym filter, so synonym filter can't recognized multi-word term to do expansion - Second, if synonym filter expand into multiple terms which contains multi-word synonym, The SolrQueryParseBase currently use MultiPhraseQuery to handle synonyms. But MultiPhraseQuery don't work with term have different number of words. For the first one, we can extend quoted all multi-word synonym in user query so that lucene queryparser don't split it. There are a jira task related to this one https://issues.apache.org/jira/browse/LUCENE-2605. For the second, we can replace MultiPhraseQuery by an appropriate BoleanQuery SHOULD which contains multiple PhraseQuery in case tokens stream have multi-word synonym. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
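The second point in the description is the easier one to picture: when a synonym expands to a different number of tokens than the original, the proposal is to drop MultiPhraseQuery and instead OR together one PhraseQuery per variant. A sketch of that shape against the plain Lucene 4.x query API (this illustrates the idea only, not the attached patches; the field name "text" is assumed):

{code}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;

public class MultiWordSynonymSketch {
  public static BooleanQuery expand() {
    // One phrase per synonym variant; each keeps its own token count.
    PhraseQuery original = new PhraseQuery();
    original.add(new Term("text", "usa"));

    PhraseQuery synonym = new PhraseQuery();
    synonym.add(new Term("text", "united"));
    synonym.add(new Term("text", "states"));
    synonym.add(new Term("text", "of"));
    synonym.add(new Term("text", "america"));

    // BooleanQuery of SHOULD clauses: matching either variant is enough.
    BooleanQuery query = new BooleanQuery();
    query.add(original, Occur.SHOULD);
    query.add(synonym, Occur.SHOULD);
    return query;
  }
}
{code}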
[jira] [Commented] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864149#comment-13864149 ] Ahmet Arslan commented on SOLR-5379: Assume synonyms are {code} usa, united states of america {code} What happens if I fire the following sloppy phrase query *president usa~5* Query-time multi-word synonym expansion --- Key: SOLR-5379 URL: https://issues.apache.org/jira/browse/SOLR-5379 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Tien Nguyen Manh Labels: multi-word, queryparser, synonym Fix For: 4.7 Attachments: quoted.patch, synonym-expander.patch While dealing with synonym at query time, solr failed to work with multi-word synonyms due to some reasons: - First the lucene queryparser tokenizes user query by space so it split multi-word term into two terms before feeding to synonym filter, so synonym filter can't recognized multi-word term to do expansion - Second, if synonym filter expand into multiple terms which contains multi-word synonym, The SolrQueryParseBase currently use MultiPhraseQuery to handle synonyms. But MultiPhraseQuery don't work with term have different number of words. For the first one, we can extend quoted all multi-word synonym in user query so that lucene queryparser don't split it. There are a jira task related to this one https://issues.apache.org/jira/browse/LUCENE-2605. For the second, we can replace MultiPhraseQuery by an appropriate BoleanQuery SHOULD which contains multiple PhraseQuery in case tokens stream have multi-word synonym. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5614) Boost documents using map and query functions
Anca Kopetz created SOLR-5614: - Summary: Boost documents using map and query functions Key: SOLR-5614 URL: https://issues.apache.org/jira/browse/SOLR-5614 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Anca Kopetz We want to boost documents that contain specific search terms in its fields. We tried the following simplified query : http://localhost:8983/solr/collection1/select?q=ipod belkinwt=xmldebugQuery=trueq.op=ANDdefType=edismaxbf=map(query($qq),0,0,0,100.0)qq={!edismax}power And we get the following error : org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' And the stacktrace : ERROR - 2014-01-06 18:27:02.275; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:171) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) Caused by: 
org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' at org.apache.solr.search.QParser.checkRecurse(QParser.java:178) at org.apache.solr.search.QParser.subQuery(QParser.java:200) at org.apache.solr.search.ExtendedDismaxQParser.getBoostFunctions(ExtendedDismaxQParser.java:437) at org.apache.solr.search.ExtendedDismaxQParser.parse(ExtendedDismaxQParser.java:175) at org.apache.solr.search.QParser.getQuery(QParser.java:142) at org.apache.solr.search.FunctionQParser.parseNestedQuery(FunctionQParser.java:236) at org.apache.solr.search.ValueSourceParser$19.parse(ValueSourceParser.java:270) at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:352) at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:223) at org.apache.solr.search.ValueSourceParser$13.parse(ValueSourceParser.java:198) at
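The ampersands in the URL above appear to have been lost in the archived message; spelled out as individual parameters, the failing request looks like the following. This SolrJ restatement is an assumption about the intended request, not text taken from the issue:

{code}
import org.apache.solr.client.solrj.SolrQuery;

public class BoostBySubquery {
  public static SolrQuery build() {
    SolrQuery q = new SolrQuery("ipod belkin");
    q.set("defType", "edismax");
    q.set("q.op", "AND");
    // Boost function: adds 100 when the nested $qq query matches, 0 otherwise.
    q.set("bf", "map(query($qq),0,0,0,100.0)");
    // The nested edismax query that trips the "Infinite Recursion" SyntaxError.
    q.set("qq", "{!edismax}power");
    q.set("debugQuery", "true");
    return q;
  }
}
{code}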
[jira] [Updated] (SOLR-5614) Boost documents using map and query functions
[ https://issues.apache.org/jira/browse/SOLR-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anca Kopetz updated SOLR-5614: -- Description: We want to boost documents that contain specific search terms in its fields. We tried the following simplified query : http://localhost:8983/solr/collection1/select?q=ipod%20belkinwt=xmldebugQuery=trueq.op=ANDdefType=edismaxbf=map(query($qq),0,0,0,100.0)qq={!edismax}power And we get the following error : org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' And the stacktrace : ERROR - 2014-01-06 18:27:02.275; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:171) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) Caused by: org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' 
at org.apache.solr.search.QParser.checkRecurse(QParser.java:178) at org.apache.solr.search.QParser.subQuery(QParser.java:200) at org.apache.solr.search.ExtendedDismaxQParser.getBoostFunctions(ExtendedDismaxQParser.java:437) at org.apache.solr.search.ExtendedDismaxQParser.parse(ExtendedDismaxQParser.java:175) at org.apache.solr.search.QParser.getQuery(QParser.java:142) at org.apache.solr.search.FunctionQParser.parseNestedQuery(FunctionQParser.java:236) at org.apache.solr.search.ValueSourceParser$19.parse(ValueSourceParser.java:270) at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:352) at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:223) at org.apache.solr.search.ValueSourceParser$13.parse(ValueSourceParser.java:198) at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:352) at org.apache.solr.search.FunctionQParser.parse(FunctionQParser.java:68) at
[jira] [Updated] (SOLR-5614) Boost documents using map and query functions
[ https://issues.apache.org/jira/browse/SOLR-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anca Kopetz updated SOLR-5614: -- Description: We want to boost documents that contain specific search terms in its fields. We tried the following simplified query : http://localhost:8983/solr/collection1/select?q=ipod belkinwt=xmldebugQuery=trueq.op=ANDdefType=edismaxbf=map(query($qq),0,0,0,100.0)qq={!edismax}power And we get the following error : org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' And the stacktrace : ERROR - 2014-01-06 18:27:02.275; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:171) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) Caused by: org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' at 
org.apache.solr.search.QParser.checkRecurse(QParser.java:178) at org.apache.solr.search.QParser.subQuery(QParser.java:200) at org.apache.solr.search.ExtendedDismaxQParser.getBoostFunctions(ExtendedDismaxQParser.java:437) at org.apache.solr.search.ExtendedDismaxQParser.parse(ExtendedDismaxQParser.java:175) at org.apache.solr.search.QParser.getQuery(QParser.java:142) at org.apache.solr.search.FunctionQParser.parseNestedQuery(FunctionQParser.java:236) at org.apache.solr.search.ValueSourceParser$19.parse(ValueSourceParser.java:270) at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:352) at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:223) at org.apache.solr.search.ValueSourceParser$13.parse(ValueSourceParser.java:198) at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:352) at org.apache.solr.search.FunctionQParser.parse(FunctionQParser.java:68) at
[jira] [Commented] (SOLR-5609) Don't let cores create slices/named replicas
[ https://issues.apache.org/jira/browse/SOLR-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864241#comment-13864241 ] Noble Paul commented on SOLR-5609: -- makes sense to have an omnibus property like legacyCloudMode rather than having specific properties for each behavior. Don't let cores create slices/named replicas Key: SOLR-5609 URL: https://issues.apache.org/jira/browse/SOLR-5609 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Fix For: 5.0, 4.7 In SolrCloud, it is possible for a core to come up in any node , and register itself with an arbitrary slice/coreNodeName. This is a legacy requirement and we would like to make it only possible for Overseer to initiate creation of slice/replicas We plan to introduce cluster level properties at the top level /cluster-props.json {code:javascript} { noSliceOrReplicaByCores:true } {code} If this property is set to true, cores won't be able to send STATE commands with unknown slice/coreNodeName . Those commands will fail at Overseer. This is useful for SOLR-5310 / SOLR-5311 where a core/replica is deleted by a command and it comes up later and tries to create a replica/slice -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5476) Overseer Role for nodes
[ https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-5476: - Attachment: SOLR-5476.patch Overseer Role for nodes --- Key: SOLR-5476 URL: https://issues.apache.org/jira/browse/SOLR-5476 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Attachments: SOLR-5476.patch, SOLR-5476.patch, SOLR-5476.patch, SOLR-5476.patch In a very large cluster the Overseer is likely to be overloaded. If the same node is serving a few other shards, it can lead to the Overseer getting slowed down due to GC pauses, or simply too much work. If the cluster is really large, it is possible to dedicate high-end h/w for Overseers. It works as a new collection admin command: command=addrole&role=overseer&node=192.168.1.5:8983_solr This results in the creation of an entry in /roles.json in ZK which would look like the following {code:javascript} { overseer : [192.168.1.5:8983_solr] } {code} If a node is designated for overseer it gets preference over others when overseer election takes place. If no designated servers are available, another random node would become the Overseer. Later on, if one of the designated nodes is brought up, it would take over the Overseer role from the current Overseer to become the Overseer of the system. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
Ramkumar Aiyengar created SOLR-5615: --- Summary: Deadlock while trying to recover after a ZK session expiry Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6, 4.5, 4.4 Reporter: Ramkumar Aiyengar The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
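The crux of the sequence above is generic: ZooKeeper delivers every watch and connection event on a single event thread, so a handler that blocks on that thread waiting for a later event can never be woken. A Solr-independent sketch of the pattern (all names below are invented for illustration):

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Toy single-threaded dispatcher standing in for ZooKeeper's main-EventThread. */
public class EventThreadDeadlockSketch {
  private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
  private final CountDownLatch newLeaderSeen = new CountDownLatch(1);

  /** The one and only event thread: events run strictly one after another. */
  void eventLoop() throws InterruptedException {
    while (true) {
      events.take().run();
    }
  }

  /** Queued first: OL's onReconnect handling, which ends up waiting for the new leader. */
  Runnable onReconnect = () -> {
    try {
      // Blocks the event thread for up to 20 minutes...
      newLeaderSeen.await(20, TimeUnit.MINUTES);
    } catch (InterruptedException ignored) {
    }
  };

  /** Queued second: the cluster-state watch that would satisfy the wait above. */
  Runnable clusterStateChanged = newLeaderSeen::countDown;

  // Because clusterStateChanged sits behind onReconnect in the queue, it cannot
  // run until onReconnect gives up -- exactly the 20-minute stall in this report.
}
{code}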
lucene-solr pull request: Allow ConnectionManager.process to run from multi...
GitHub user andyetitmoves opened a pull request: https://github.com/apache/lucene-solr/pull/13 Allow ConnectionManager.process to run from multiple threads One potential fix for SOLR-5615. Hardly sure about whether this is the correct way to go about this, but it's a start I guess.. You can merge this pull request into a Git repository by running: $ git pull https://github.com/andyetitmoves/lucene-solr on-recovery-deadlock-4x Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/13.patch commit ad7ac506bc614d43f391aaad7ab25d9b426421c4 Author: Ramkumar Aiyengar raiyen...@bloomberg.net Date: 2014-01-07T11:57:25Z Allow ConnectionManager.process to run from multiple threads - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
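The usual shape of such a fix is to keep the Watcher callback itself cheap and push the long-running reconnect work onto another thread, so later watch events can still be delivered. A hedged sketch of that idea against the raw ZooKeeper Watcher API (this is not the content of the pull request, just the general pattern):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

/** Sketch: keep the ZK event thread free by offloading expiry handling. */
public class OffloadingWatcher implements Watcher {
  private final ExecutorService reconnectExecutor = Executors.newSingleThreadExecutor();

  @Override
  public void process(WatchedEvent event) {
    if (event.getState() == Event.KeeperState.Expired) {
      // Re-registering cores and waiting for a leader may block for a long
      // time; doing it here would starve every subsequent watch event.
      reconnectExecutor.submit(this::handleExpiry);
      return;
    }
    // Other events are expected to be cheap and can be handled inline.
  }

  private void handleExpiry() {
    // reconnect to ZK, publish state, register replicas ...
  }
}
{code}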
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864250#comment-13864250 ] Ramkumar Aiyengar commented on SOLR-5615: - Submitted https://github.com/apache/lucene-solr/pull/13 for one possible solution, though I am not sure if this is the right way to go about this.. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5354) Blended score in AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864328#comment-13864328 ] Remi Melisson commented on LUCENE-5354: --- Hi, any news about this feature ? Could I do anything else ? Blended score in AnalyzingInfixSuggester Key: LUCENE-5354 URL: https://issues.apache.org/jira/browse/LUCENE-5354 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Affects Versions: 4.4 Reporter: Remi Melisson Priority: Minor Labels: suggester Attachments: LUCENE-5354.patch, LUCENE-5354_2.patch I'm working on a custom suggester derived from the AnalyzingInfix. I require what is called a blended score (//TODO ln.399 in AnalyzingInfixSuggester) to transform the suggestion weights depending on the position of the searched term(s) in the text. Right now, I'm using an easy solution : If I want 10 suggestions, then I search against the current ordered index for the 100 first results and transform the weight : bq. a) by using the term position in the text (found with TermVector and DocsAndPositionsEnum) or bq. b) by multiplying the weight by the score of a SpanQuery that I add when searching and return the updated 10 most weighted suggestions. Since we usually don't need to suggest so many things, the bigger search + rescoring overhead is not so significant but I agree that this is not the most elegant solution. We could include this factor (here the position of the term) directly into the index. So, I can contribute to this if you think it's worth adding it. Do you think I should tweak AnalyzingInfixSuggester, subclass it or create a dedicated class ? -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
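The "easy solution" described above -- over-fetch, re-weight each suggestion by where the query term sits in its text, then keep the top N -- is essentially a rescoring pass over the suggester's output. A deliberately suggester-agnostic sketch (the Suggestion holder and the position heuristic below are invented for illustration, not part of AnalyzingInfixSuggester):

{code}
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class BlendedScoreSketch {

  /** Invented holder for a suggestion text and its stored weight. */
  public static class Suggestion {
    final String text;
    final long weight;
    Suggestion(String text, long weight) { this.text = text; this.weight = weight; }
  }

  /** Over-fetch (say 100 results), blend each weight with the term's position, keep the top n. */
  public static List<Suggestion> rescore(List<Suggestion> overFetched, String term, int n) {
    return overFetched.stream()
        .map(s -> new Suggestion(s.text, s.weight / (1 + termPosition(s.text, term))))
        .sorted(Comparator.comparingLong((Suggestion s) -> s.weight).reversed())
        .limit(n)
        .collect(Collectors.toList());
  }

  /** Crude position heuristic: how many words precede the first occurrence of the term. */
  static int termPosition(String text, String term) {
    int idx = text.toLowerCase().indexOf(term.toLowerCase());
    if (idx < 0) {
      return text.length(); // unmatched terms sink to the bottom
    }
    String prefix = text.substring(0, idx).trim();
    return prefix.isEmpty() ? 0 : prefix.split("\\s+").length;
  }
}
{code}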
[jira] [Updated] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5615: -- Attachment: SOLR-5615.patch Not sure given the info, but the patch doesn't seem crazy to me. I've made a few adjustments in this patch. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Attachments: SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5613) Upgrade Apache Commons Codec to version 1.9 in order to improve performance of BeiderMorseFilter
[ https://issues.apache.org/jira/browse/SOLR-5613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864364#comment-13864364 ] Shawn Heisey commented on SOLR-5613: I upgraded commons-codec to 1.9 on an up-to-date branch_4x checkout and found that all tests (both Lucene and Solr) passed. This was on a linux machine. I wasn't too surprised by this. I think we can accommodate this request easily. Just for giggles, I went even further and upgraded all commons.apache.org components to the newest versions I could find via ivy. All tests *still* passed. This was on a Windows 8 machine. With so many upgrades, I was really surprised it passed. {code} Index: lucene/ivy-versions.properties === --- lucene/ivy-versions.properties (revision 1555313) +++ lucene/ivy-versions.properties (working copy) @@ -19,16 +19,16 @@ /com.ibm.icu/icu4j = 52.1 /com.spatial4j/spatial4j = 0.3 /com.sun.jersey/jersey-core = 1.16 -/commons-beanutils/commons-beanutils = 1.7.0 +/commons-beanutils/commons-beanutils = 1.9.0 /commons-cli/commons-cli = 1.2 -/commons-codec/commons-codec = 1.7 +/commons-codec/commons-codec = 1.9 /commons-collections/commons-collections = 3.2.1 -/commons-configuration/commons-configuration = 1.6 -/commons-digester/commons-digester = 2.0 -/commons-fileupload/commons-fileupload = 1.2.1 -/commons-io/commons-io = 2.1 +/commons-configuration/commons-configuration = 1.10 +/commons-digester/commons-digester = 2.1 +/commons-fileupload/commons-fileupload = 1.3 +/commons-io/commons-io = 2.4 /commons-lang/commons-lang = 2.6 -/commons-logging/commons-logging = 1.1.1 +/commons-logging/commons-logging = 1.1.3 /de.l3s.boilerpipe/boilerpipe = 1.1.0 /dom4j/dom4j = 1.6.1 /edu.ucar/netcdf = 4.2-min {code} I'm not advocating that we upgrade all the components at once, but it looks like we can indeed upgrade them all eventually. I only ran the basic tests, so additional tests (nightly, weekly, etc) need to be done. Upgrade Apache Commons Codec to version 1.9 in order to improve performance of BeiderMorseFilter Key: SOLR-5613 URL: https://issues.apache.org/jira/browse/SOLR-5613 Project: Solr Issue Type: Improvement Components: Rules, Schema and Analysis, search Affects Versions: 3.6, 3.6.1, 3.6.2, 4.0, 4.1, 4.2, 4.2.1, 4.3, 4.3.1, 4.4, 4.5, 4.5.1, 4.6 Reporter: Thomas Champagne Labels: codec, commons, commons-codec, phonetic, search In version 1.9 of commons-codec project, there are a lot of optimizations in the Beider Morse encoder. This is used by the BeiderMorseFilter in Solr. Do you think it is possible to upgrade this dependency ? -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-5463: --- Description: I'd like to revist a solution to the problem of deep paging in Solr, leveraging an HTTP based API similar to how IndexSearcher.searchAfter works at the lucene level: require the clients to provide back a token indicating the sort values of the last document seen on the previous page. This is similar to the cursor model I've seen in several other REST APIs that support pagnation over a large sets of results (notable the twitter API and it's since_id param) except that we'll want something that works with arbitrary multi-level sort critera that can be either ascending or descending. SOLR-1726 laid some initial ground work here and was commited quite a while ago, but the key bit of argument parsing to leverage it was commented out due to some problems (see comments in that issue). It's also somewhat out of date at this point: at the time it was commited, IndexSearcher only supported searchAfter for simple scores, not arbitrary field sorts; and the params added in SOLR-1726 suffer from this limitation as well. --- I think it would make sense to start fresh with a new issue with a focus on ensuring that we have deep paging which: * supports arbitrary field sorts in addition to sorting by score * works in distributed mode {panel:title=Basic Usage} * send a request with {{sort=Xstart=0rows=NcursorMark=*}} ** sort can be anything, but must include the uniqueKey field (as a tie breaker) ** N can be any number you want per page ** start must be 0 ** \* denotes you want to use a cursor starting at the beginning mark * parse the response body and extract the (String) {{nextCursorMark}} value * Replace the \* value in your initial request params with the {{nextCursorMark}} value from the response in the subsequent request * repeat until the {{nextCursorMark}} value stops changing, or you have collected as many docs as you need {panel} was: I'd like to revist a solution to the problem of deep paging in Solr, leveraging an HTTP based API similar to how IndexSearcher.searchAfter works at the lucene level: require the clients to provide back a token indicating the sort values of the last document seen on the previous page. This is similar to the cursor model I've seen in several other REST APIs that support pagnation over a large sets of results (notable the twitter API and it's since_id param) except that we'll want something that works with arbitrary multi-level sort critera that can be either ascending or descending. SOLR-1726 laid some initial ground work here and was commited quite a while ago, but the key bit of argument parsing to leverage it was commented out due to some problems (see comments in that issue). It's also somewhat out of date at this point: at the time it was commited, IndexSearcher only supported searchAfter for simple scores, not arbitrary field sorts; and the params added in SOLR-1726 suffer from this limitation as well. 
--- I think it would make sense to start fresh with a new issue with a focus on ensuring that we have deep paging which: * supports arbitrary field sorts in addition to sorting by score * works in distributed mode Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging) -- Key: SOLR-5463 URL: https://issues.apache.org/jira/browse/SOLR-5463 Project: Solr Issue Type: New Feature Reporter: Hoss Man Assignee: Hoss Man Fix For: 5.0 Attachments: SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man__MissingStringLastComparatorSource.patch I'd like to revist a solution to the problem of deep paging in Solr, leveraging an HTTP based API similar to how IndexSearcher.searchAfter works at the lucene level: require the clients to provide back a token indicating the sort values of the last document seen on the previous page. This is similar to the cursor model I've seen in several other REST APIs that support pagnation over a large sets of results (notable the twitter API and it's since_id param) except that we'll want something that works with arbitrary multi-level sort critera that can be either ascending or descending. SOLR-1726 laid some initial ground work here and was commited quite a while ago, but
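The Basic Usage panel translates naturally into a fetch loop. Because this issue is the one introducing the feature, the SolrJ accessor used below for the next mark (getNextCursorMark()) and the sort fields are assumptions for illustration, not something stated here:

{code}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CursorPagingSketch {
  /** Walks an entire result set page by page using cursorMark. */
  static void fetchAll(SolrServer server) throws SolrServerException {
    SolrQuery q = new SolrQuery("*:*");
    q.setRows(100);                                  // any page size N
    // The sort must include the uniqueKey field ("id" here) as a tie breaker.
    q.addSort("timestamp", SolrQuery.ORDER.asc);
    q.addSort("id", SolrQuery.ORDER.asc);
    String cursorMark = "*";                         // "*" = start at the beginning
    while (true) {
      q.set("cursorMark", cursorMark);
      QueryResponse rsp = server.query(q);
      // ... consume rsp.getResults() ...
      String next = rsp.getNextCursorMark();         // assumed accessor for nextCursorMark
      if (next.equals(cursorMark)) {
        break;                                       // mark stopped changing: no more pages
      }
      cursorMark = next;
    }
  }
}
{code}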
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864389#comment-13864389 ] Ramkumar Aiyengar commented on SOLR-5615: - Here's some log trace which actually happened, might help understand the scenario above.. {code} 2014-01-06 06:22:03,867 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:88] Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper... // .. 2014-01-06 06:22:12,529 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:103] Connection with ZooKeeper reestablished. // .. 2014-01-06 06:22:36,573 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:989] publishing core=collection_20131120_shard205_replica2 state=down // .. 2014-01-06 06:28:01,479 INFO [main-EventThread] o.a.s.c.c.ZkStateReader [ZkStateReader.java:199] Updating cluster state from ZooKeeper... 2014-01-06 06:28:01,487 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:651] Register node as live in ZooKeeper:/live_nodes/host5:10750_solr // See trace above, it directly got cluster state from ZK and successfully found the leader, so there is actually a leader at this point contrary to what it finds below 2014-01-06 06:28:01,567 INFO [main-EventThread] o.a.s.c.c.SolrZkClient [SolrZkClient.java:378] makePath: /live_nodes/host5:10750_solr 2014-01-06 06:28:01,669 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:757] Register replica - core:collection_20131120_shard241_replica2 address:http://host5:10750/solr collection:collection_20131120 shard:shard241 2014-01-06 06:28:01,669 INFO [main-EventThread] o.a.s.c.s.i.HttpClientUtil [HttpClientUtil.java:103] Creating new http client, config:maxConnections=1maxConnectionsPerHost=20connTimeout=3socketTimeout=3retry=false // nothing much after this on main-EventThread for 20 mins.. 2014-01-06 06:54:01,786 ERROR [main-EventThread] o.a.s.c.ZkController [ZkController.java:869] Error getting leader from zk org.apache.solr.common.SolrException: No registered leader was found, collection:collection_20131120 slice:shard241 // Then goes on to the next replica .. 2014-01-06 06:54:01,786 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:757] Register replica - core:collection_20131120_shard209_replica2 address:http://host5:10750/solr collection:collection_20131120 shard:shard209 2014-01-06 06:54:01,786 INFO [main-EventThread] o.a.s.c.s.i.HttpClientUtil [HttpClientUtil.java:103] Creating new http client, config:maxConnections=1maxConnectionsPerHost=20connTimeout=3socketTimeout=3retry=false // waits another twenty mins (by which time I ordered a shutdown, so things started erroring out sooner after that) 2014-01-06 07:19:21,656 ERROR [main-EventThread] o.a.s.c.ZkController [ZkController.java:869] Error getting leader from zk org.apache.solr.common.SolrException: No registered leader was found, collection:collection_20131120 slice:shard209 // After trying to register all other replicas, these failed fast because we had ordered a shutdown already.. 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.DefaultConnectionStrategy [DefaultConnectionStrategy.java:48] Reconnected to ZooKeeper 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:130] Connected:true // And immediately, *now* it fires all the events it was waiting for! 
2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:72] Watcher org.apache.solr.common.cloud.ConnectionManager@2467da0a name:ZooKeeperConnection Watcher:host1:11600,host2:11600,host3:11600 got event WatchedEvent state:Disconnected type:None path:null path:null type:None 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.z.ClientCnxn [ClientCnxn.java:509] EventThread shut down {code} Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Attachments: SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for
[jira] [Comment Edited] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864389#comment-13864389 ] Ramkumar Aiyengar edited comment on SOLR-5615 at 1/7/14 5:02 PM: - Here's some log trace which actually happened, might help understand the scenario above.. {code} 2014-01-06 06:22:03,867 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:88] Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper... // .. 2014-01-06 06:22:12,529 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:103] Connection with ZooKeeper reestablished. // .. 2014-01-06 06:22:36,573 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:989] publishing core=collection_20131120_shard205_replica2 state=down // .. 2014-01-06 06:28:01,479 INFO [main-EventThread] o.a.s.c.c.ZkStateReader [ZkStateReader.java:199] Updating cluster state from ZooKeeper... 2014-01-06 06:28:01,487 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:651] Register node as live in ZooKeeper:/live_nodes/host5:10750_solr // See trace above, it directly got leader props from ZK successfully, so there is actually a leader at this point contrary to what it finds below 2014-01-06 06:28:01,567 INFO [main-EventThread] o.a.s.c.c.SolrZkClient [SolrZkClient.java:378] makePath: /live_nodes/host5:10750_solr 2014-01-06 06:28:01,669 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:757] Register replica - core:collection_20131120_shard241_replica2 address:http://host5:10750/solr collection:collection_20131120 shard:shard241 2014-01-06 06:28:01,669 INFO [main-EventThread] o.a.s.c.s.i.HttpClientUtil [HttpClientUtil.java:103] Creating new http client, config:maxConnections=1maxConnectionsPerHost=20connTimeout=3socketTimeout=3retry=false // nothing much after this on main-EventThread for 20 mins.. 2014-01-06 06:54:01,786 ERROR [main-EventThread] o.a.s.c.ZkController [ZkController.java:869] Error getting leader from zk org.apache.solr.common.SolrException: No registered leader was found, collection:collection_20131120 slice:shard241 // Then goes on to the next replica .. 2014-01-06 06:54:01,786 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:757] Register replica - core:collection_20131120_shard209_replica2 address:http://host5:10750/solr collection:collection_20131120 shard:shard209 2014-01-06 06:54:01,786 INFO [main-EventThread] o.a.s.c.s.i.HttpClientUtil [HttpClientUtil.java:103] Creating new http client, config:maxConnections=1maxConnectionsPerHost=20connTimeout=3socketTimeout=3retry=false // waits another twenty mins (by which time I ordered a shutdown, so things started erroring out sooner after that) 2014-01-06 07:19:21,656 ERROR [main-EventThread] o.a.s.c.ZkController [ZkController.java:869] Error getting leader from zk org.apache.solr.common.SolrException: No registered leader was found, collection:collection_20131120 slice:shard209 // After trying to register all other replicas, these failed fast because we had ordered a shutdown already.. 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.DefaultConnectionStrategy [DefaultConnectionStrategy.java:48] Reconnected to ZooKeeper 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:130] Connected:true // And immediately, *now* it fires all the events it was waiting for! 
2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:72] Watcher org.apache.solr.common.cloud.ConnectionManager@2467da0a name:ZooKeeperConnection Watcher:host1:11600,host2:11600,host3:11600 got event WatchedEvent state:Disconnected type:None path:null path:null type:None 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.z.ClientCnxn [ClientCnxn.java:509] EventThread shut down {code} was (Author: andyetitmoves): Here's some log trace which actually happened, might help understand the scenario above.. {code} 2014-01-06 06:22:03,867 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:88] Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper... // .. 2014-01-06 06:22:12,529 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:103] Connection with ZooKeeper reestablished. // .. 2014-01-06 06:22:36,573 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:989] publishing core=collection_20131120_shard205_replica2 state=down // .. 2014-01-06 06:28:01,479 INFO [main-EventThread] o.a.s.c.c.ZkStateReader [ZkStateReader.java:199] Updating cluster state from ZooKeeper... 2014-01-06 06:28:01,487 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:651] Register node as live in
[jira] [Created] (SOLR-5616) Make grouping code use response builder needDocList
Steven Bower created SOLR-5616: -- Summary: Make grouping code use response builder needDocList Key: SOLR-5616 URL: https://issues.apache.org/jira/browse/SOLR-5616 Project: Solr Issue Type: Bug Reporter: Steven Bower Right now the grouping code does this to check if it needs to generate a docList for grouped results: {code} if (rb.doHighlights || rb.isDebug() || params.getBool(MoreLikeThisParams.MLT, false) ){ ... } {code} This is ugly because any new component that needs a docList from grouped results will need to modify QueryComponent to add a check to this if-statement. Ideally this should just use the rb.isNeedDocList() flag... Coincidentally, this boolean is really never used; for non-grouped results the docList always gets generated. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
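The proposal amounts to two small pieces: a component declares its need during prepare(), and the grouping path in QueryComponent checks the flag instead of enumerating components. A hedged sketch of the component side (this is not the attached patch, and it assumes ResponseBuilder exposes a setter alongside isNeedDocList()):

{code}
import java.io.IOException;

import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

/** A component that asks for a docList instead of patching QueryComponent's if-statement. */
public abstract class DocListNeedingComponent extends SearchComponent {
  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    // With this change, the grouping code would simply honor:
    //   if (rb.isNeedDocList()) { /* generate docList for grouped results */ }
    rb.setNeedDocList(true);
  }
}
{code}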
[jira] [Comment Edited] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864389#comment-13864389 ] Ramkumar Aiyengar edited comment on SOLR-5615 at 1/7/14 5:04 PM: - Here's some log trace which actually happened, might help understand the scenario above.. {code} 2014-01-06 06:22:03,867 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:88] Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper... // .. 2014-01-06 06:22:12,529 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:103] Connection with ZooKeeper reestablished. // .. 2014-01-06 06:22:36,573 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:989] publishing core=collection_20131120_shard205_replica2 state=down // .. 2014-01-06 06:28:01,479 INFO [main-EventThread] o.a.s.c.c.ZkStateReader [ZkStateReader.java:199] Updating cluster state from ZooKeeper... 2014-01-06 06:28:01,487 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:651] Register node as live in ZooKeeper:/live_nodes/host5:10750_solr // See trace above, it directly got leader props from ZK successfully, so there is actually a leader at this point contrary to what it finds below 2014-01-06 06:28:01,567 INFO [main-EventThread] o.a.s.c.c.SolrZkClient [SolrZkClient.java:378] makePath: /live_nodes/host5:10750_solr 2014-01-06 06:28:01,669 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:757] Register replica - core:collection_20131120_shard241_replica2 address:http://host5:10750/solr collection:collection_20131120 shard:shard241 2014-01-06 06:28:01,669 INFO [main-EventThread] o.a.s.c.s.i.HttpClientUtil [HttpClientUtil.java:103] Creating new http client, config:maxConnections=1maxConnectionsPerHost=20connTimeout=3socketTimeout=3retry=false // nothing much after this on main-EventThread for 20 mins.. 2014-01-06 06:54:01,786 ERROR [main-EventThread] o.a.s.c.ZkController [ZkController.java:869] Error getting leader from zk org.apache.solr.common.SolrException: No registered leader was found, collection:collection_20131120 slice:shard241 // Then goes on to the next replica .. 2014-01-06 06:54:01,786 INFO [main-EventThread] o.a.s.c.ZkController [ZkController.java:757] Register replica - core:collection_20131120_shard209_replica2 address:http://host5:10750/solr collection:collection_20131120 shard:shard209 2014-01-06 06:54:01,786 INFO [main-EventThread] o.a.s.c.s.i.HttpClientUtil [HttpClientUtil.java:103] Creating new http client, config:maxConnections=1maxConnectionsPerHost=20connTimeout=3socketTimeout=3retry=false // waits another twenty mins (by which time I ordered a shutdown, so things started erroring out sooner after that) 2014-01-06 07:19:21,656 ERROR [main-EventThread] o.a.s.c.ZkController [ZkController.java:869] Error getting leader from zk org.apache.solr.common.SolrException: No registered leader was found, collection:collection_20131120 slice:shard209 // After trying to register all other replicas, these failed fast because we had ordered a shutdown already.. 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.DefaultConnectionStrategy [DefaultConnectionStrategy.java:48] Reconnected to ZooKeeper 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:130] Connected:true // And immediately, *now* it fires all the events it was waiting for! 
2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:72] Watcher org.apache.solr.common.cloud.ConnectionManager@2467da0a name:ZooKeeperConnection Watcher:host1:11600,host2:11600,host3:11600 got event WatchedEvent state:Disconnected type:None path:null path:null type:None 2014-01-06 07:19:21,693 INFO [main-EventThread] o.a.z.ClientCnxn [ClientCnxn.java:509] EventThread shut down // many more such disc events, and then the watches 2014-01-06 07:19:21,694 WARN [main-EventThread] o.a.s.c.c.ZkStateReader [ZkStateReader.java:281] ZooKeeper watch triggered, but Solr cannot talk to ZK 2014-01-06 07:19:21,694 INFO [main-EventThread] o.a.s.c.c.ZkStateReader [ZkStateReader.java:210] A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 112) 2014-01-06 07:19:21,694 WARN [main-EventThread] o.a.s.c.c.ZkStateReader [ZkStateReader.java:234] ZooKeeper watch triggered, but Solr cannot talk to ZK {code} was (Author: andyetitmoves): Here's some log trace which actually happened, might help understand the scenario above.. {code} 2014-01-06 06:22:03,867 INFO [main-EventThread] o.a.s.c.c.ConnectionManager [ConnectionManager.java:88] Our previous ZooKeeper session was expired. Attempting to reconnect to recover
[jira] [Updated] (SOLR-5616) Make grouping code use response builder needDocList
[ https://issues.apache.org/jira/browse/SOLR-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Bower updated SOLR-5616: --- Attachment: SOLR-5616.patch Here is a patch that makes this change. It's against trunk but should easily patch onto older versions. Ideally this would get into a 4.x release. Make grouping code use response builder needDocList --- Key: SOLR-5616 URL: https://issues.apache.org/jira/browse/SOLR-5616 Project: Solr Issue Type: Bug Reporter: Steven Bower Attachments: SOLR-5616.patch Right now the grouping code does this to check if it needs to generate a docList for grouped results: {code} if (rb.doHighlights || rb.isDebug() || params.getBool(MoreLikeThisParams.MLT, false) ){ ... } {code} This is ugly because any new component that needs a docList from grouped results will need to modify QueryComponent to add a check to this if. Ideally this should just use the rb.isNeedDocList() flag... Coincidentally, this boolean is never really used, as for non-grouped results the docList always gets generated. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
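For context, a minimal sketch of the shape of this change (assuming ResponseBuilder exposes the usual isNeedDocList()/setNeedDocList() pair the issue refers to; this is not the attached patch):
{code}
// Before: each component that needs a docList from grouped results has to be
// wired into this hard-coded check inside QueryComponent.
if (rb.doHighlights || rb.isDebug() || params.getBool(MoreLikeThisParams.MLT, false)) {
  // build the docList for the grouped results
}

// After: components declare the need themselves (e.g. in their prepare() method)
// and QueryComponent only consults the flag.
rb.setNeedDocList(true); // done by highlighting, debug, MLT, or any new component

if (rb.isNeedDocList()) {
  // build the docList for the grouped results
}
{code}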
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864401#comment-13864401 ] Mark Miller commented on SOLR-5615: --- Thanks, perfect. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Attachments: SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864402#comment-13864402 ] Nolan Lawson commented on SOLR-5379: [~markus17]: They're boosted equally. It was the subject of [a bug|https://github.com/healthonnet/hon-lucene-synonyms/issues/31]. [~iorixxx]: I just tested it out now. I got: {code} (+(DisjunctionMaxQuery((text:"president usa"~5)) (((+DisjunctionMaxQuery((text:"president united states of america"~5)))/no_coord/no_coord // parsedQuery +((text:"president usa"~5) ((+(text:"president united states of america"~5 // parsedQuery.toString() {code} Query-time multi-word synonym expansion --- Key: SOLR-5379 URL: https://issues.apache.org/jira/browse/SOLR-5379 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Tien Nguyen Manh Labels: multi-word, queryparser, synonym Fix For: 4.7 Attachments: quoted.patch, synonym-expander.patch While dealing with synonyms at query time, solr fails to work with multi-word synonyms for two reasons: - First, the lucene queryparser tokenizes the user query by space, so it splits a multi-word term into separate terms before feeding them to the synonym filter; the synonym filter therefore can't recognize the multi-word term and expand it. - Second, if the synonym filter expands into multiple terms which contain a multi-word synonym, the SolrQueryParserBase currently uses MultiPhraseQuery to handle synonyms, but MultiPhraseQuery doesn't work with terms that have different numbers of words. For the first one, we can quote all multi-word synonyms in the user query so that the lucene queryparser doesn't split them. There is a jira task related to this: https://issues.apache.org/jira/browse/LUCENE-2605. For the second, we can replace MultiPhraseQuery with an appropriate BooleanQuery of SHOULD clauses containing multiple PhraseQuery instances when the token stream has a multi-word synonym. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
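To make the second proposal concrete, here is a minimal sketch (assuming the Lucene 4.x query API; the field name and synonym terms are illustrative, taken from the example above) of combining variable-length synonym variants as SHOULD-ed PhraseQuery clauses instead of a single MultiPhraseQuery:
{code}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.Query;

public class SynonymQuerySketch {
  /** Builds one PhraseQuery per synonym variant, whatever its length. */
  public static Query buildSynonymQuery(String field) {
    PhraseQuery usa = new PhraseQuery();
    usa.add(new Term(field, "usa"));

    PhraseQuery expanded = new PhraseQuery();
    for (String word : new String[] {"united", "states", "of", "america"}) {
      expanded.add(new Term(field, word));
    }

    // SHOULD-ed together: either variant can match on its own, unlike
    // MultiPhraseQuery, which expects the variants at each position to
    // have the same number of terms.
    BooleanQuery query = new BooleanQuery();
    query.add(usa, Occur.SHOULD);
    query.add(expanded, Occur.SHOULD);
    return query;
  }
}
{code}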
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864434#comment-13864434 ] Mark Miller commented on SOLR-5615: --- Okay, now it's more clear to me. We need to run onReconnect in a background thread I think. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Attachments: SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
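To illustrate the background-thread idea being discussed (only a sketch, not the attached patch; the class and method names are hypothetical): the ZooKeeper event thread merely hands the work off and returns, so it stays free to deliver the watch events the reconnect logic is waiting on.
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical dispatcher: run the (potentially long) onReconnect logic on a
// single background worker instead of on ZooKeeper's event thread.
class ReconnectDispatcher {
  private final ExecutorService worker = Executors.newSingleThreadExecutor();

  void connectionReestablished(Runnable onReconnect) {
    worker.submit(onReconnect); // the event thread is released immediately
  }

  void close() {
    worker.shutdown();
  }
}
{code}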
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864446#comment-13864446 ] Ramkumar Aiyengar commented on SOLR-5615: - That, incidentally, was my first attempt at a fix! (Should have a diff..) However, onReconnect in any case runs in the event thread of the expired ZK which wouldn't have events after that, so it's effectively backgrounded? It should still work as a solution I guess.. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Attachments: SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864460#comment-13864460 ] Mark Miller commented on SOLR-5615: --- bq. However, onReconnect in any case runs in the event thread of the expired ZK which wouldn't have events after that, so it's effectively backgrounded? But it holds the ConnectionManager this lock while it runs right? I think we just don't want to hold that lock while it runs. I think the other changes are likely okay too - I'm playing around with a combination of the two. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Attachments: SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5615: -- Attachment: SOLR-5615.patch Another rev. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Attachments: SOLR-5615.patch, SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5615: -- Fix Version/s: 4.6.1 4.7 5.0 Assignee: Mark Miller Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5615.patch, SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Iterating BinaryDocValues
Joel, I tried to hack it straightforwardly, but found no free gain there. The only attempt I can suggest is to try to reuse bytes in https://github.com/apache/lucene-solr/blame/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java#L401 - right now it allocates bytes every time, which besides GC pressure can also hurt memory access locality. Could you try fixing the memory waste and repeating the performance test? Have a good hack! On Mon, Dec 23, 2013 at 9:51 PM, Joel Bernstein joels...@gmail.com wrote: Hi, I'm looking for a faster way to perform large scale docId - bytesRef lookups for BinaryDocValues. I'm finding that I can't get the performance that I need from the random access seek in the BinaryDocValues interface. I'm wondering if sequentially scanning the docValues would be a faster approach. I have a BitSet of matching docs, so if I sequentially moved through the docValues I could test each one against that bitset. Wondering if that approach would be faster for bulk extracts and how tricky it would be to add an iterator to the BinaryDocValues interface? Thanks, Joel -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
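For reference, a minimal sketch of the sequential-scan approach Joel describes, assuming the Lucene 4.x doc values API (AtomicReader.getBinaryDocValues and the void get(int, BytesRef) accessor) and a FixedBitSet sized to maxDoc(); the consume() callback is hypothetical:
{code}
import java.io.IOException;

import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.FixedBitSet;

public class SequentialDocValuesScan {

  /** Walks docIDs in increasing order and reads values only for matching docs. */
  static void scan(AtomicReader reader, String field, FixedBitSet matches) throws IOException {
    BinaryDocValues values = reader.getBinaryDocValues(field);
    if (values == null) {
      return; // no binary doc values for this field in this segment
    }
    BytesRef scratch = new BytesRef();
    for (int doc = 0; doc < reader.maxDoc(); doc++) {
      if (matches.get(doc)) {
        values.get(doc, scratch); // fills scratch with this doc's bytes
        consume(doc, scratch);
      }
    }
  }

  static void consume(int doc, BytesRef value) {
    // hypothetical sink: write the bytes to the export stream, etc.
  }
}
{code}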
[jira] [Resolved] (SOLR-5614) Boost documents using map and query functions
[ https://issues.apache.org/jira/browse/SOLR-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-5614. Resolution: Invalid Please don't file a bug just because you've been waiting 24 hours for an answer to a question on the solr-user mailing list - sometimes it takes longer than that for people to answer. https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201312.mbox/%3c52c17579.30...@kelkoo.com%3E Boost documents using map and query functions - Key: SOLR-5614 URL: https://issues.apache.org/jira/browse/SOLR-5614 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Anca Kopetz We want to boost documents that contain specific search terms in their fields. We tried the following simplified query : http://localhost:8983/solr/collection1/select?q=ipod belkin&wt=xml&debugQuery=true&q.op=AND&defType=edismax&bf=map(query($qq),0,0,0,100.0)&qq={!edismax}power And we get the following error : org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' And the stacktrace : ERROR - 2014-01-06 18:27:02.275; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:171) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) Caused by: org.apache.solr.search.SyntaxError: Infinite Recursion detected parsing query 'power' at org.apache.solr.search.QParser.checkRecurse(QParser.java:178) at org.apache.solr.search.QParser.subQuery(QParser.java:200) at org.apache.solr.search.ExtendedDismaxQParser.getBoostFunctions(ExtendedDismaxQParser.java:437) at org.apache.solr.search.ExtendedDismaxQParser.parse(ExtendedDismaxQParser.java:175) at org.apache.solr.search.QParser.getQuery(QParser.java:142)
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864475#comment-13864475 ] Mark Miller commented on SOLR-5615: --- Even with the other changes, I like the idea of using a background thread because I don't think it's right that we do that whole reconnect process before we set that we are connected to zk and get out of the connection manager. I really don't think that process should hold up the connection manager at all - it's meant to just trigger it. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5615.patch, SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5617) Default classloader restrictions may be too tight
Shawn Heisey created SOLR-5617: -- Summary: Default classloader restrictions may be too tight Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shawn Heisey Fix For: 5.0, 4.7 SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but it also causes resources in ${solr.solr.home}/lib to fail to load. In order to get those jars to work, I must turn off all SOLR-4882 safety checking. I can understand not wanting to load resources from an arbitrary path, but ${solr.solr.home} and its children should be about as trustworthy as instanceDir. Ideally I'd like to have ${solr.solr.home}/lib trusted automatically, since it is searched automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864491#comment-13864491 ] Ramkumar Aiyengar commented on SOLR-5615: - Fair enough. Would that allow multiple onReconnect.command () invocations to run simultaneously, and is that fine? (on mobile, so my reading of the patch could be wrong) What if we were in the process of recovering when we were unfortunate enough to get a second expiry thereby bringing all nodes down? Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5615.patch, SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5244) Full Search Result Export
[ https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864496#comment-13864496 ] Mikhail Khludnev commented on SOLR-5244: bq. 1) Add a special cache that speeds up the docId->bytesRef lookup. This would be a segment level cache of the top N terms (by frequency) in the index. The cache would be a simple int to BytesRef hashmap, mapping the segment level ord to the bytesRef That's exactly what you've got in FieldCache.DEFAULT.getTerms() for an indexed field without docvalues enabled. See FieldCacheImpl.BinaryDocValuesCache. Full Search Result Export - Key: SOLR-5244 URL: https://issues.apache.org/jira/browse/SOLR-5244 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Fix For: 5.0 Attachments: SOLR-5244.patch It would be great if Solr could efficiently export entire search result sets without scoring or ranking documents. This would allow external systems to perform rapid bulk imports from Solr. It also provides a possible platform for exporting results to support distributed join scenarios within Solr. This ticket provides a patch that has two pluggable components: 1) ExportQParserPlugin: a post filter that gathers a BitSet with document results and does not delegate to ranking collectors. Instead it puts the BitSet on the request context. 2) BinaryExportWriter: an output writer that iterates the BitSet and prints the entire result as a binary stream. A header is provided at the beginning of the stream so external clients can self-configure. Note: These two components will be sufficient for a non-distributed environment. For distributed export a new Request handler will need to be developed. After applying the patch and building the dist or example, you can register the components through the following changes to solrconfig.xml. Register the export contrib libraries: <lib dir="../../../dist/" regex="solr-export-\d.*\.jar" /> Register the export queryParser with the following line: <queryParser name="export" class="org.apache.solr.export.ExportQParserPlugin"/> Register the xbin writer: <queryResponseWriter name="xbin" class="org.apache.solr.export.BinaryExportWriter"/> The following query will perform the export: {code} http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i {code} The initial patch supports export of four data-types: 1) Single value trie int, long and float 2) Binary doc values. The numerics are currently exported from the FieldCache and the Binary doc values can be in memory or on disk. Since this is designed to export very large result sets efficiently, stored fields are not used for the export. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5617) Default classloader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-5617: --- Description: SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but it also causes resources in $\{solr.solr.home\}/lib to fail to load. In order to get those jars to work, I must turn off all SOLR-4882 safety checking. I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have $\{solr.solr.home\}/lib trusted automatically, since it is searched automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. was: SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but it also causes resources in ${solr.solr.home}/lib to fail to load. In order to get those jars to work, I must turn off all SOLR-4882 safety checking. I can understand not wanting to load resources from an arbitrary path, but ${solr.solr.home} and its children should be about as trustworthy as instanceDir. Ideally I'd like to have ${solr.solr.home}/lib trusted automatically, since it is searched automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. Default classloader restrictions may be too tight - Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shawn Heisey Labels: security Fix For: 5.0, 4.7 SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but it also causes resources in $\{solr.solr.home\}/lib to fail to load. In order to get those jars to work, I must turn off all SOLR-4882 safety checking. I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have $\{solr.solr.home\}/lib trusted automatically, since it is searched automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864502#comment-13864502 ] Mark Miller commented on SOLR-5615: --- Yeah, I've been considered the same thing. My inclination was it was okay, but we may have to add something to cancel our leader election before joining the election to be sure. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5615.patch, SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5617) Default classloader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864505#comment-13864505 ] Shawn Heisey commented on SOLR-5617: I will have to double-check, but I probably have the specifics that required me to turn off the safety checking wrong. It may have been configuration components gathered via xinclude, not jarfiles. Either way, I am sure that everything is under the solr home. Default classloader restrictions may be too tight - Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shawn Heisey Labels: security Fix For: 5.0, 4.7 SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but it also causes resources in $\{solr.solr.home\}/lib to fail to load. In order to get those jars to work, I must turn off all SOLR-4882 safety checking. I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have $\{solr.solr.home\}/lib trusted automatically, since it is searched automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Closed] (SOLR-5611) When documents are uniformly distributed over shards, enable returning approximated results in distributed query
[ https://issues.apache.org/jira/browse/SOLR-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isaac Hebsh closed SOLR-5611. - Resolution: Not A Problem Oops. I missed the {{shards.rows}} parameter. When documents are uniformly distributed over shards, enable returning approximated results in distributed query Key: SOLR-5611 URL: https://issues.apache.org/jira/browse/SOLR-5611 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Isaac Hebsh Labels: distributed_search, shard, solrcloud Fix For: 4.7 A query with rows=1000, which is sent to a collection of 100 shards (shard key behaviour is default - based on hash of the unique key), will generate 100 requests of rows=1000, one on each shard. This results in a total of rows*numShards unique keys being retrieved. This behaviour gets worse as numShards grows. If the documents are uniformly distributed over the shards, the expected number of documents per shard should be ~ rows/numShards. Obviously, there might be extreme cases, when all of the top X documents are in a specific shard. I suggest adding an optional parameter, say approxResults=true, which decides whether we should limit the rows in the shard requests to rows/numShards or not. Moreover, we can add a numeric parameter which increases the limit, to be more accurate. For example, the query {{approxResults=true&approxResults.factor=1.5}} will retrieve 1.5*rows/numShards from each shard. In the case of 100 shards and rows=1000, each shard will return 15 documents. Furthermore, this can reduce the problem of deep paging, because the same thing can be applied there. When start=10 is requested, Solr creates shard requests with start=0 and rows=START+ROWS. In the approximated approach, the start parameter (in the shard requests) can be set to 10/numShards. The idea of the approxResults.factor creates some difficulties here, though. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
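For the record, the existing {{shards.rows}} parameter already covers this: it sets the rows requested from each shard independently of the top-level rows. A hedged example (host, collection and the per-shard value are illustrative only):
{code}
http://localhost:8983/solr/collection1/select?q=*:*&rows=1000&shards.rows=15
{code}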
[jira] [Commented] (SOLR-5560) Enable LocalParams without escaping the query
[ https://issues.apache.org/jira/browse/SOLR-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864553#comment-13864553 ] Isaac Hebsh commented on SOLR-5560: --- Hi [~ryancutter], thank you a lot! I'm not familiar with parser states (thank god), so I can't review the patch. What action should be performed in order to commit this patch? (into 4.7?) Enable LocalParams without escaping the query - Key: SOLR-5560 URL: https://issues.apache.org/jira/browse/SOLR-5560 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.6 Reporter: Isaac Hebsh Fix For: 4.7, 4.6.1 Attachments: SOLR-5560.patch This query should be legit syntax: http://localhost:8983/solr/collection1/select?debugQuery=true&defType=lucene&df=id&q=TERM1 AND {!lucene df=text}(TERM2 TERM3 TERM4 TERM5) Currently it isn't, because the LocalParams can be specified on a single term only. [~billnbell] thinks it is a bug. From the mailing list: {quote} We want to set a LocalParam on a nested query. When querying with the v inline parameter, it works fine: http://localhost:8983/solr/collection1/select?debugQuery=true&defType=lucene&df=id&q=TERM1 AND {!lucene df=text v=TERM2 TERM3 \TERM4 TERM5\} the parsedquery_toString is +id:TERM1 +(text:term2 text:term3 text:term4 term5) Query using the _query_ syntax also works fine: http://localhost:8983/solr/collection1/select?debugQuery=true&defType=lucene&df=id&q=TERM1 AND _query_:{!lucene df=text}TERM2 TERM3 \TERM4 TERM5\ (parsedquery is exactly the same). Obviously, there is the option of an external parameter ({... v=$nestedq}nestedq=...) This is a good solution, but it is not practical when having a lot of such nested queries. BUT, when trying to put the nested query in place, it yields a syntax error: http://localhost:8983/solr/collection1/select?debugQuery=true&defType=lucene&df=id&q=TERM1 AND {!lucene df=text}(TERM2 TERM3 TERM4 TERM5) org.apache.solr.search.SyntaxError: Cannot parse '(TERM2' The previous options are less preferred because of the escaping that has to be done on the nested query. {quote} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5615: -- Attachment: SOLR-5615.patch Another rev that adds what I think is a decent change anyway - before joining an election, cancel any known previous election participation. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5615.patch, SOLR-5615.patch, SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits till the cluster state is updated with the new leader from event thread - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out after trying to register itself as replica times out after 20 mins. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864579#comment-13864579 ] Hoss Man commented on SOLR-5594: * Aren't there other parsers classes that will need similar changes? (PrefixQParserPlugin, SimplerQParserPlugin at a minimum i think) * I think your new FieldType.getPrefixQuery method has a back compat break for any existing FieldTypes that people might be using because it now calls readableToIndexed ... that smells like it could break things for some FieldTypes ... but maybe i'm missing something? * FieldType.getPrefixQuery has lots of bogus cut/pasted javadocs from getRangeQuery * Can't your MyIndexedBinaryField just subclass BinaryField to reduce some code? for that matter: is there any reason why we shouldn't just make BinaryField implement prefix queries in the way your MyIndexedBinaryField does? * i'm not sure i understand why you need BinaryTokenStream for the test (see previous comment about just extending/improving BinaryField) but if so perhaps it should be moved from lucene/core to lucene/test-framework? Enable using extended field types with prefix queries for non-default encoded strings - Key: SOLR-5594 URL: https://issues.apache.org/jira/browse/SOLR-5594 Project: Solr Issue Type: Improvement Components: query parsers, Schema and Analysis Affects Versions: 4.6 Reporter: Anshum Gupta Assignee: Anshum Gupta Priority: Minor Attachments: SOLR-5594-branch_4x.patch, SOLR-5594.patch Enable users to be able to use prefix query with custom field types with non-default encoding/decoding for queries more easily. e.g. having a custom field work with base64 encoded query strings. Currently, the workaround for it is to have the override at getRewriteMethod level. Perhaps having the prefixQuery also use the calling FieldType's readableToIndexed method would work better. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5361) FVH throws away some boosts
[ https://issues.apache.org/jira/browse/LUCENE-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864583#comment-13864583 ] Adrien Grand commented on LUCENE-5361: -- Thanks Nik, your fix looks good! I don't think cloning the queries is an issue, it happens all the time when doing rewrites, and it's definitely better than modifying those queries in-place. I'll commit it tomorrow if there is no objection. FVH throws away some boosts --- Key: LUCENE-5361 URL: https://issues.apache.org/jira/browse/LUCENE-5361 Project: Lucene - Core Issue Type: Bug Reporter: Nik Everett Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5361.patch The FVH's FieldQuery throws away some boosts when flattening queries, including DisjunctionMaxQuery and BooleanQuery queries. Fragments generated against queries containing boosted boolean queries don't end up sorted correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Pull requests versus JIRAta
Further adventures in token streams have motivated me to play tech writer some more. Options: 1. just create github pull requests. 2. reopen prior jira 3. make new jira preference? - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Pull requests versus JIRAta
I think 1 or 3 is best. The downside of 2 is just the confusion, since the other doc was good, i dont think we have to reopen it. i cant imagine anyone worried about having too many jiras with documentation fixes! On Tue, Jan 7, 2014 at 3:21 PM, Benson Margulies bimargul...@gmail.com wrote: Further adventures in token streams have motivated me to play tech writer some more. Options: 1. just create github pull requests. 2. reopen prior jira 3. make new jira preference? - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Pull requests versus JIRAta
OK. Hopefully this time I'll remember to watch my own JIRA so that I don't ignore Uwe. On Tue, Jan 7, 2014 at 3:24 PM, Robert Muir rcm...@gmail.com wrote: I think 1 or 3 is best. The downside of 2 is just the confusion, since the other doc was good, i dont think we have to reopen it. i cant imagine anyone worried about having too many jiras with documentation fixes! On Tue, Jan 7, 2014 at 3:21 PM, Benson Margulies bimargul...@gmail.com wrote: Further adventures in token streams have motivated me to play tech writer some more. Options: 1. just create github pull requests. 2. reopen prior jira 3. make new jira preference? - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: The Old Git Discussion
I've followed this thread with interest, and although I'm (sadly) a lapsed Apache committer (not Lucene/Solr), I finally had to comment as I've just gone through the pain of learning git after many happy years with svn. In my long experience in IT I've learned one incontrovertible fact: most times, the technical merits of one technology over another are not nearly as important as everyone thinks. It is really all about how WELL you use a given technology to get the job done. The stuff I do in git now, I could do in SVN, and vice versa. I'd wager I could do the same in CVS or even older technologies. It like Ant versus Maven versus Gradle. I can do the same in each of these. Each has their own good and bad points. I'll stick with Ant and SVN to the end but hey, if a client works only with Gradle and Git and XYZ technology and has an intellectual investment there, I'm not gonna argue the point on technical merits. That being said, I think the worst argument one could make about anything is that we should move to it because everyone else is. People will flock to fads as much (I could argue: more) than to genuine technical improvements (anyone remember the 70s? 80s? 90s?). Git feels a bit faddish to me, and is definitely immature. I get some of the advantages, but I don't think I should have to be a gitk expert to use the damn software - its over-engineered and actually opens up the door to more convoluted development processes. Whether Git is a fad or not, the issue, as pointed out below, is supporting the way contributors work. The win-win situation would be to keep the core based on SVN but support git contributions (as I know someone else suggested). SVN is a technology that is stable and which all core committers know like the back of their hands - no sense in wasting time learning git when people are donating time and that time is better spent on JIRAs. What I don't know is how this GIT integration would work, but I'd hope it could be done. Just to push home the point, I'll bet most of us who have been around a while have plenty of stories of IT shops moving from one technology to another ... and then in a few years to another ... and then to another - all because some manager got a burr up his rear or was wined and dined by a vendor. Why? Why hurt productivity for the sake of keep up with the times? How about setting an example of sticking with what works despite the made rush to github? My €.02. Lajos Moczar On 06/01/2014 17:01, Robert Muir wrote: On Sun, Jan 5, 2014 at 12:07 PM, Mark Miller markrmil...@gmail.com wrote: My point here is not really to discuss the merits of Git VS SVN on a feature / interface basis. We might as well talk about MySQL vs Postgres. Personally, I prefer GIT. It feels good when I use it. SVN feels like crap. That doesn't make me want to move. I've used SVN for years with Lucene/Solr, and like everyone, it's pretty much second nature. The problem is the world is moving. It may not be clear to everyone yet, but give it a bit more time and it will be. Git already owns the open source world. It rivals SVN by most guesses in the proprietary world. This is a strong hard trend. The same trend that saw SVN eat CVS. I think clearly, a distributed version control system will dominate. And clearly Git has won. I'm not ready to call a vote, because I don't think it's critical we switch yet. But I wanted to continue the discussion, as obviously, plenty of it will be needed over time before we made such a switch. It's not about one thing being better than the other. 
It's about using what everyone else uses so you don't provide a barrier to contribution. It's about the post I linked to when I started this thread. I personally don't care about pull requests and Github. I don't think any of it's features are that great, other than it acts as a central repo. Git is not good because of Github IMO. But Git and Github are eating the world. Most of the patches I have processed now are made against Git. Jumping from SVN to Git and back is very annoying IMO though. There are plenty of tools and workflows for it and they all suck. Anyway, as the trend continues, it will become even more obvious that Lucene/Solr will start looking stale on SVN. We have enough image problems in terms of being modern at Apache. We will need to manage the ones we can. We should not choose the tools that simply make us fuzzy and comfortable. We should choose the tools that are best for the project and future contributions in the long term. - Mark The idea that this has anything to do with contributors is misleading. Today contributors can use either SVN or GIT. They have their choice. How can it be any better than that for contributors? As demonstrated over the weekend, its also possible today for contributors to use svn+jira or git+pull request workflow. As i said earlier, why not spend our time trying to make it easier on contributors and support
Re: Iterating BinaryDocValues
Going sequentially should help, if the pages are not hot (in the OS's IO cache). You can also use a different DVFormat, e.g. Direct, but this holds all bytes in RAM. Mike McCandless http://blog.mikemccandless.com On Tue, Jan 7, 2014 at 1:09 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: Joel, I tried to hack it straightforwardly, but found no free gain there. The only attempt I can suggest is to try to reuse bytes in https://github.com/apache/lucene-solr/blame/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java#L401 right now it allocates bytes every time, which beside of GC can also impact memory access locality. Could you try fix memory waste and repeat performance test? Have a good hack! On Mon, Dec 23, 2013 at 9:51 PM, Joel Bernstein joels...@gmail.com wrote: Hi, I'm looking for a faster way to perform large scale docId - bytesRef lookups for BinaryDocValues. I'm finding that I can't get the performance that I need from the random access seek in the BinaryDocValues interface. I'm wondering if sequentially scanning the docValues would be a faster approach. I have a BitSet of matching docs, so if I sequentially moved through the docValues I could test each one against that bitset. Wondering if that approach would be faster for bulk extracts and how tricky it would be to add an iterator to the BinaryDocValues interface? Thanks, Joel -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5354) Blended score in AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864683#comment-13864683 ] Michael McCandless commented on LUCENE-5354: Woops, sorry, this fell below the event horizon of my TODO list. I'll look at your new patch soon. There is an existing performance test, LookupBenchmarkTest, but it's a bit tricky to run. See the comment on LUCENE-5030: https://issues.apache.org/jira/browse/LUCENE-5030?focusedCommentId=13689155page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13689155 Blended score in AnalyzingInfixSuggester Key: LUCENE-5354 URL: https://issues.apache.org/jira/browse/LUCENE-5354 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Affects Versions: 4.4 Reporter: Remi Melisson Priority: Minor Labels: suggester Attachments: LUCENE-5354.patch, LUCENE-5354_2.patch I'm working on a custom suggester derived from the AnalyzingInfix. I require what is called a blended score (//TODO ln.399 in AnalyzingInfixSuggester) to transform the suggestion weights depending on the position of the searched term(s) in the text. Right now, I'm using an easy solution : If I want 10 suggestions, then I search against the current ordered index for the 100 first results and transform the weight : bq. a) by using the term position in the text (found with TermVector and DocsAndPositionsEnum) or bq. b) by multiplying the weight by the score of a SpanQuery that I add when searching and return the updated 10 most weighted suggestions. Since we usually don't need to suggest so many things, the bigger search + rescoring overhead is not so significant but I agree that this is not the most elegant solution. We could include this factor (here the position of the term) directly into the index. So, I can contribute to this if you think it's worth adding it. Do you think I should tweak AnalyzingInfixSuggester, subclass it or create a dedicated class ? -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5244) Full Search Result Export
[ https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864689#comment-13864689 ] Joel Bernstein commented on SOLR-5244: -- I'll do some testing of the performance of this. Unless I'm missing something though, it looks like you have to go through a PagedBytes.Reader and a PackedInts.Reader to get the BytesRef. I think it would perform similarly to the in-memory BinaryDocValues I was using for my initial test. The cache I was thinking of building would be backed by the hppc IntObjectOpenHashMap, with which I should be able to do 10 million+ read operations per second. Full Search Result Export - Key: SOLR-5244 URL: https://issues.apache.org/jira/browse/SOLR-5244 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Fix For: 5.0 Attachments: SOLR-5244.patch It would be great if Solr could efficiently export entire search result sets without scoring or ranking documents. This would allow external systems to perform rapid bulk imports from Solr. It also provides a possible platform for exporting results to support distributed join scenarios within Solr. This ticket provides a patch that has two pluggable components: 1) ExportQParserPlugin: a post filter that gathers a BitSet with document results and does not delegate to ranking collectors. Instead it puts the BitSet on the request context. 2) BinaryExportWriter: an output writer that iterates the BitSet and prints the entire result as a binary stream. A header is provided at the beginning of the stream so external clients can self-configure. Note: These two components will be sufficient for a non-distributed environment. For distributed export a new Request handler will need to be developed. After applying the patch and building the dist or example, you can register the components through the following changes to solrconfig.xml. Register the export contrib libraries: <lib dir="../../../dist/" regex="solr-export-\d.*\.jar" /> Register the export queryParser with the following line: <queryParser name="export" class="org.apache.solr.export.ExportQParserPlugin"/> Register the xbin writer: <queryResponseWriter name="xbin" class="org.apache.solr.export.BinaryExportWriter"/> The following query will perform the export: {code} http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i {code} The initial patch supports export of four data-types: 1) Single value trie int, long and float 2) Binary doc values. The numerics are currently exported from the FieldCache and the Binary doc values can be in memory or on disk. Since this is designed to export very large result sets efficiently, stored fields are not used for the export. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
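A minimal sketch of the segment-level cache Joel describes, assuming the HPPC library's IntObjectOpenHashMap; the class and method names here are made up, and the ord/top-N selection plumbing is left out:
{code}
import com.carrotsearch.hppc.IntObjectOpenHashMap;

import org.apache.lucene.util.BytesRef;

/** Per-segment cache of the top-N most frequent terms: segment ord -> term bytes. */
public class TopTermsCache {
  private final IntObjectOpenHashMap<BytesRef> byOrd = new IntObjectOpenHashMap<BytesRef>();

  /** Populated once per segment with the N most frequent terms. */
  public void put(int ord, BytesRef term) {
    byOrd.put(ord, BytesRef.deepCopyOf(term)); // keep a private copy of the bytes
  }

  /** Returns the cached bytes, or null if this ord is not one of the top N. */
  public BytesRef get(int ord) {
    return byOrd.get(ord);
  }
}
{code}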
Re: The Old Git Discussion
I don’t really buy the fad argument, but as I’ve said, I’m willing to wait a little longer for others to catch on. I try and follow the stats and reports and articles on this pretty closely. As I mentioned early in the thread, by all appearances, the shift from SVN to GIT looks much like the shift from CVS to SVN. This was not a fad change, nor is the next mass movement likely to be. Just like no one starts a project on CVS anymore, we are almost already to the point where new projects start exclusively on GIT - especially open source. I’m happy to sit back and watch the trend continue though. The number of GIT users in the committee and among the committers only grows every time the discussion comes up. If this was 2009, 2010, 2011 … who knows, perhaps I would buy some fad argument. But it just doesn’t jibe in 2014. - Mark On Jan 7, 2014, at 3:33 PM, Lajos la...@protulae.com wrote: I've followed this thread with interest, and although I'm (sadly) a lapsed Apache committer (not Lucene/Solr), I finally had to comment as I've just gone through the pain of learning git after many happy years with svn. In my long experience in IT I've learned one incontrovertible fact: most times, the technical merits of one technology over another are not nearly as important as everyone thinks. It is really all about how WELL you use a given technology to get the job done. The stuff I do in git now, I could do in SVN, and vice versa. I'd wager I could do the same in CVS or even older technologies. It's like Ant versus Maven versus Gradle. I can do the same in each of these. Each has its own good and bad points. I'll stick with Ant and SVN to the end but hey, if a client works only with Gradle and Git and XYZ technology and has an intellectual investment there, I'm not gonna argue the point on technical merits. That being said, I think the worst argument one could make about anything is that we should move to it because everyone else is. People will flock to fads as much as (I could argue: more than) to genuine technical improvements (anyone remember the 70s? 80s? 90s?). Git feels a bit faddish to me, and is definitely immature. I get some of the advantages, but I don't think I should have to be a gitk expert to use the damn software - it's over-engineered and actually opens up the door to more convoluted development processes. Whether Git is a fad or not, the issue, as pointed out below, is supporting the way contributors work. The win-win situation would be to keep the core based on SVN but support git contributions (as I know someone else suggested). SVN is a technology that is stable and which all core committers know like the back of their hands - no sense in wasting time learning git when people are donating time and that time is better spent on JIRAs. What I don't know is how this GIT integration would work, but I'd hope it could be done. Just to push home the point, I'll bet most of us who have been around a while have plenty of stories of IT shops moving from one technology to another ... and then in a few years to another ... and then to another - all because some manager got a burr up his rear or was wined and dined by a vendor. Why? Why hurt productivity for the sake of keeping up with the times? How about setting an example of sticking with what works despite the mad rush to github? My €.02.
Lajos Moczar On 06/01/2014 17:01, Robert Muir wrote: On Sun, Jan 5, 2014 at 12:07 PM, Mark Miller markrmil...@gmail.com wrote: My point here is not really to discuss the merits of Git VS SVN on a feature / interface basis. We might as well talk about MySQL vs Postgres. Personally, I prefer GIT. It feels good when I use it. SVN feels like crap. That doesn't make me want to move. I've used SVN for years with Lucene/Solr, and like everyone, it's pretty much second nature. The problem is the world is moving. It may not be clear to everyone yet, but give it a bit more time and it will be. Git already owns the open source world. It rivals SVN by most guesses in the proprietary world. This is a strong hard trend. The same trend that saw SVN eat CVS. I think clearly, a distributed version control system will dominate. And clearly Git has won. I'm not ready to call a vote, because I don't think it's critical we switch yet. But I wanted to continue the discussion, as obviously, plenty of it will be needed over time before we make such a switch. It's not about one thing being better than the other. It's about using what everyone else uses so you don't provide a barrier to contribution. It's about the post I linked to when I started this thread. I personally don't care about pull requests and Github. I don't think any of its features are that great, other than it acts as a central repo. Git is not good because of Github IMO. But Git and Github are eating the world. Most of the patches I have processed now are
[jira] [Updated] (SOLR-5617) Default classloader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-5617: --- Description: SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but what if you have common resources like included config files that are outside instanceDir but are still fully inside the solr home? I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have anything that's in $\{solr.solr.home\} trusted automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. was: SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but it also causes resources in $\{solr.solr.home\}/lib to fail to load. In order to get those jars to work, I must turn off all SOLR-4882 safety checking. I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have $\{solr.solr.home\}/lib trusted automatically, since it is searched automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. Default classloader restrictions may be too tight - Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shawn Heisey Labels: security Fix For: 5.0, 4.7 SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but what if you have common resources like included config files that are outside instanceDir but are still fully inside the solr home? I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have anything that's in $\{solr.solr.home\} trusted automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
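The trust rule SOLR-5617 asks for — anything under the solr home should be about as trustworthy as instanceDir — boils down to a path-containment check. The following is a small, hypothetical sketch of such a check using java.nio.file, not the SOLR-4882 code itself; the sample paths reuse the /index/solr4 layout shown in the stack trace in the following comment.

{code:java}
import java.nio.file.Path;
import java.nio.file.Paths;

public class ResourceTrustSketch {

  /** True if resource resolves to a location inside base (after normalizing away ".." etc.). */
  static boolean isUnder(Path base, Path resource) {
    return resource.toAbsolutePath().normalize()
        .startsWith(base.toAbsolutePath().normalize());
  }

  public static void main(String[] args) {
    Path solrHome = Paths.get("/index/solr4");
    Path instanceDir = Paths.get("/index/solr4/cores/s1_0");

    Path sharedConfig = Paths.get("/index/solr4/config/common/luceneMatchVersion.xml");
    Path outside = Paths.get("/tmp/somewhere/else.xml");

    // Today only the instanceDir check matters, so shared config fails to load;
    // the request here is that passing the solrHome check should be sufficient.
    System.out.println("shared under instanceDir: " + isUnder(instanceDir, sharedConfig)); // false
    System.out.println("shared under solr home:   " + isUnder(solrHome, sharedConfig));    // true
    System.out.println("outside under solr home:  " + isUnder(solrHome, outside));         // false
  }
}
{code}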
[jira] [Comment Edited] (SOLR-5617) Default classloader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864505#comment-13864505 ] Shawn Heisey edited comment on SOLR-5617 at 1/7/14 9:44 PM: Here's a stacktrace from my attempted start on 4.6.0 without the option to allow unsafe resource loading. The solr home is /index/solr4: {noformat} ERROR - 2014-01-07 14:37:05.493; org.apache.solr.common.SolrException; null:org.apache.solr.common.SolrException: SolrCore 's1build' is not available due to init failure: Could not load config file /index/solr4/cores/s1_0/solrconfig.xml at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:825) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:293) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1476) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:370) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.solr.common.SolrException: Could not load config file /index/solr4/cores/s1_0/solrconfig.xml at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:532) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:599) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:253) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:245) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at 
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ... 1 more Caused by: org.apache.solr.common.SolrException: org.xml.sax.SAXParseException; systemId: solrres:/solrconfig.xml; lineNumber: 7; columnNumber: 70; An include with href '../../../config/common/luceneMatchVersion.xml'failed, and no fallback element was found. at org.apache.solr.core.Config.init(Config.java:148) at org.apache.solr.core.Config.init(Config.java:86) at org.apache.solr.core.SolrConfig.init(SolrConfig.java:129) at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:529) ... 11 more Caused by: org.xml.sax.SAXParseException; systemId: solrres:/solrconfig.xml; lineNumber: 7; columnNumber: 70; An include with href '../../../config/common/luceneMatchVersion.xml'failed,
[jira] [Created] (LUCENE-5388) Eliminate construction over readers for Tokenizer
Benson Margulies created LUCENE-5388: Summary: Eliminate construction over readers for Tokenizer Key: LUCENE-5388 URL: https://issues.apache.org/jira/browse/LUCENE-5388 Project: Lucene - Core Issue Type: Improvement Components: core/other Reporter: Benson Margulies In the modern world, Tokenizers are intended to be reusable, with input supplied via #setReader. The constructors that take Reader are a vestige. Worse yet, they invite people to make mistakes in handling the reader that tangle them up with the state machine in Tokenizer. The sensible thing is to eliminate these ctors, and force setReader usage. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
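The reuse pattern LUCENE-5388 is pushing toward looks like the sketch below, written against the 4.x analysis API; the Version+Reader constructor used here is exactly the vestigial ctor the issue proposes to eliminate, and the doc strings are of course arbitrary.

{code:java}
import java.io.IOException;
import java.io.StringReader;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class ReusableTokenizerSketch {
  public static void main(String[] args) throws IOException {
    // Constructed once; the Reader argument is the vestige this issue wants to remove.
    Tokenizer tok = new WhitespaceTokenizer(Version.LUCENE_46, new StringReader(""));
    CharTermAttribute term = tok.addAttribute(CharTermAttribute.class);

    for (String doc : new String[] {"hello tokenizer world", "reuse me please"}) {
      tok.setReader(new StringReader(doc)); // per-document input: the intended API
      tok.reset();
      while (tok.incrementToken()) {
        System.out.println(term.toString());
      }
      tok.end();
      tok.close();
    }
  }
}
{code}

With the constructor Reader gone, only the setReader/reset/incrementToken/end/close cycle remains, which is the state machine Tokenizer already expects.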
[jira] [Created] (LUCENE-5389) Even more doc for construction of TokenStream components
Benson Margulies created LUCENE-5389: Summary: Even more doc for construction of TokenStream components Key: LUCENE-5389 URL: https://issues.apache.org/jira/browse/LUCENE-5389 Project: Lucene - Core Issue Type: Improvement Reporter: Benson Margulies There are more useful things to tell would-be authors of tokenizers. Let's tell them. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-5170: -- Attachment: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch.txt Adds recipDistance scoring, lat/long is one param. Spatial multi-value distance sort via DocValues --- Key: SOLR-5170 URL: https://issues.apache.org/jira/browse/SOLR-5170 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Attachments: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch, SOLR-5170_spatial_multi-value_sort_via_docvalues.patch.txt The attached patch implements spatial multi-value distance sorting. In other words, a document can have more than one point per field, and using a provided function query, it will return the distance to the closest point. The data goes into binary DocValues, and as-such it's pretty friendly to realtime search requirements, and it only uses 8 bytes per point. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
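The "8 bytes per point" figure in SOLR-5170 corresponds to two 4-byte floats per lat/lon pair. Below is a hedged sketch of that packing and of the "distance to the closest point" loop the function query performs; the exact encoding and distance math in the patch may differ, and the placeholder Euclidean metric here stands in for a proper geo distance.

{code:java}
import java.nio.ByteBuffer;

public class MultiPointSketch {

  /** Pack lat/lon pairs into 8 bytes per point, as a binary DocValues payload might. */
  static byte[] pack(float[][] points) {
    ByteBuffer buf = ByteBuffer.allocate(points.length * 8);
    for (float[] p : points) {
      buf.putFloat(p[0]); // lat
      buf.putFloat(p[1]); // lon
    }
    return buf.array();
  }

  /** Decode the payload and return the distance to the closest stored point. */
  static double minDistance(byte[] payload, float lat, float lon) {
    ByteBuffer buf = ByteBuffer.wrap(payload);
    double best = Double.POSITIVE_INFINITY;
    while (buf.hasRemaining()) {
      float plat = buf.getFloat();
      float plon = buf.getFloat();
      // Placeholder metric; the real patch would use a proper geo distance (e.g. haversine).
      double d = Math.sqrt(Math.pow(plat - lat, 2) + Math.pow(plon - lon, 2));
      best = Math.min(best, d);
    }
    return best;
  }

  public static void main(String[] args) {
    byte[] doc = pack(new float[][] {{40.7f, -74.0f}, {34.0f, -118.2f}});
    System.out.println("bytes per doc: " + doc.length); // 16 for two points
    System.out.println("min distance:  " + minDistance(doc, 41.0f, -73.5f));
  }
}
{code}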
[jira] [Commented] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864738#comment-13864738 ] Jeff Wartes commented on SOLR-5170: --- I've been using this patch with some minor tweaks and solr 4.3.1 in production for about six months now. Since I was applying it again against 4.6 this morning, I figured I should attach my tweaks, and mention it passes tests against 4.6. This does NOT address the design issues David raises in the initial comment. The changes vs the initial patchfile allow it to be applied against a greater range of solr versions, and brings it a little closer to feeling the same as geofilt's params. Spatial multi-value distance sort via DocValues --- Key: SOLR-5170 URL: https://issues.apache.org/jira/browse/SOLR-5170 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Attachments: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch The attached patch implements spatial multi-value distance sorting. In other words, a document can have more than one point per field, and using a provided function query, it will return the distance to the closest point. The data goes into binary DocValues, and as-such it's pretty friendly to realtime search requirements, and it only uses 8 bytes per point. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5617) Default classloader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864741#comment-13864741 ] Shawn Heisey commented on SOLR-5617: I have figured out a workaround. I've got a config structure that heavily uses xinclude and symlinks. By changing things around so that only the symlinks traverse upwards and xinclude only refers to local files, I no longer need to enable unsafe loading. I still think that it would be useful to fix this issue, but the urgency is gone. Default classloader restrictions may be too tight - Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shawn Heisey Labels: security Fix For: 5.0, 4.7 SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but what if you have common resources like included config files that are outside instanceDir but are still fully inside the solr home? I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have anything that's in $\{solr.solr.home\} trusted automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5617) Default classloader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-5617: --- Priority: Minor (was: Major) Default classloader restrictions may be too tight - Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shawn Heisey Priority: Minor Labels: security Fix For: 5.0, 4.7 SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but what if you have common resources like included config files that are outside instanceDir but are still fully inside the solr home? I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have anything that's in $\{solr.solr.home\} trusted automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5388) Eliminate construction over readers for Tokenizer
[ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864742#comment-13864742 ] Robert Muir commented on LUCENE-5388: - +1, it's really silly it's this way. I guess it's the right thing to do this for 5.0 only: I wish we had done it for 4.0, but it is what it is. Should be a rather large and noisy change unfortunately. I can help, let me know. Eliminate construction over readers for Tokenizer - Key: LUCENE-5388 URL: https://issues.apache.org/jira/browse/LUCENE-5388 Project: Lucene - Core Issue Type: Improvement Components: core/other Reporter: Benson Margulies In the modern world, Tokenizers are intended to be reusable, with input supplied via #setReader. The constructors that take Reader are a vestige. Worse yet, they invite people to make mistakes in handling the reader that tangle them up with the state machine in Tokenizer. The sensible thing is to eliminate these ctors, and force setReader usage. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5244) Full Search Result Export
[ https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864689#comment-13864689 ] Joel Bernstein edited comment on SOLR-5244 at 1/7/14 10:12 PM: --- I'll do some testing of the performance of this. Unless I'm missing something though, it looks like you have to go through a PagedBytes.Reader and PackedInts.Reader to get the BytesRef. I think it would have similar performance to the in-memory BinaryDocValues I was using for my initial test. The cache I was thinking of building would be backed by hppc IntObjectOpenHashMap, with which I should be able to do 10 million+ read operations per second. was (Author: joel.bernstein): I'll do some testing of the performance of this. Unless I'm missing something though, it looks like you have go through a PagedBytes.Reader, PackedInts.Reader to get the BytesRef. I think would perform with similar performance to the in memory BinaryDocValues I was using for my initial test. The cache I was thinking of building would be backed by hppc IntObjectOpenHashMap, which I should been able to do 10 million+ read operations per second. Full Search Result Export - Key: SOLR-5244 URL: https://issues.apache.org/jira/browse/SOLR-5244 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Fix For: 5.0 Attachments: SOLR-5244.patch It would be great if Solr could efficiently export entire search result sets without scoring or ranking documents. This would allow external systems to perform rapid bulk imports from Solr. It also provides a possible platform for exporting results to support distributed join scenarios within Solr. This ticket provides a patch that has two pluggable components: 1) ExportQParserPlugin: a post filter that gathers a BitSet with document results and does not delegate to ranking collectors. Instead it puts the BitSet on the request context. 2) BinaryExportWriter: an output writer that iterates the BitSet and writes the entire result as a binary stream. A header is provided at the beginning of the stream so external clients can self-configure. Note: These two components will be sufficient for a non-distributed environment. For distributed export a new request handler will need to be developed. After applying the patch and building the dist or example, you can register the components through the following changes to solrconfig.xml Register export contrib libraries: lib dir=../../../dist/ regex=solr-export-\d.*\.jar / Register the export queryParser with the following line: queryParser name=export class=org.apache.solr.export.ExportQParserPlugin/ Register the xbin writer: queryResponseWriter name=xbin class=org.apache.solr.export.BinaryExportWriter/ The following query will perform the export: {code} http://localhost:8983/solr/collection1/select?q=*:*fq={!export}wt=xbinfl=join_i {code} Initial patch supports export of four data-types: 1) Single value trie int, long and float 2) Binary doc values. The numerics are currently exported from the FieldCache and the Binary doc values can be in memory or on disk. Since this is designed to export very large result sets efficiently, stored fields are not used for the export. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
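The cache Joel describes maps primitive int doc ids straight to values, avoiding per-lookup decoding and boxing. A hypothetical sketch is below, using hppc's IntObjectOpenHashMap (the class name in the hppc releases of that era; later releases renamed it IntObjectHashMap) and String values as a stand-in for the BytesRef payloads the export writer would actually cache.

{code:java}
import com.carrotsearch.hppc.IntObjectOpenHashMap;

public class DocValueCacheSketch {
  public static void main(String[] args) {
    // Warm the cache once: one primitive-keyed entry per doc id.
    IntObjectOpenHashMap<String> cache = new IntObjectOpenHashMap<String>();
    for (int doc = 0; doc < 1_000_000; doc++) {
      cache.put(doc, "value-" + doc); // in practice: the decoded per-doc value
    }

    // Reads are a single primitive-int hash lookup: no boxing, no re-decoding.
    long start = System.nanoTime();
    long hits = 0;
    for (int doc = 0; doc < 1_000_000; doc++) {
      if (cache.get(doc) != null) {
        hits++;
      }
    }
    System.out.println(hits + " lookups in " + (System.nanoTime() - start) / 1_000_000 + " ms");
  }
}
{code}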
oom in documentation-lint
Is there a recipe to avoid this? -documentation-lint: [echo] checking for broken html... [ivy:cachepath] downloading http://repo1.maven.org/maven2/net/sf/jtidy/jtidy/r938/jtidy-r938.jar ... [ivy:cachepath] .. (244kB) [ivy:cachepath] .. (0kB) [ivy:cachepath] [SUCCESSFUL ] net.sf.jtidy#jtidy;r938!jtidy.jar (383ms) [jtidy] Checking for broken html (such as invalid tags)... BUILD FAILED /Users/benson/asf/lucene-solr/build.xml:57: The following error occurred while executing this line: /Users/benson/asf/lucene-solr/lucene/build.xml:208: The following error occurred while executing this line: /Users/benson/asf/lucene-solr/lucene/build.xml:214: The following error occurred while executing this line: /Users/benson/asf/lucene-solr/lucene/common-build.xml:1851: java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2271) at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113) at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282) at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207) at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129) at java.io.BufferedWriter.write(BufferedWriter.java:230) at java.io.PrintWriter.write(PrintWriter.java:456) at java.io.PrintWriter.write(PrintWriter.java:473) at java.io.PrintWriter.print(PrintWriter.java:603) at java.io.PrintWriter.println(PrintWriter.java:739) at org.w3c.tidy.Report.printMessage(Report.java:754) at org.w3c.tidy.Report.errorSummary(Report.java:1572) at org.w3c.tidy.Tidy.parse(Tidy.java:608) at org.w3c.tidy.Tidy.parse(Tidy.java:263) at org.w3c.tidy.ant.JTidyTask.processFile(JTidyTask.java:457) at org.w3c.tidy.ant.JTidyTask.executeSet(JTidyTask.java:420) at org.w3c.tidy.ant.JTidyTask.execute(JTidyTask.java:364) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at org.apache.tools.ant.taskdefs.Sequential.execute(Sequential.java:68) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) Total time: 3 minutes 35 seconds - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
lucene-solr pull request: LUCENE-5389: more analysis advice.
GitHub user benson-basis opened a pull request: https://github.com/apache/lucene-solr/pull/14 LUCENE-5389: more analysis advice. Before we change the protocol for tokenizer construction, let's get plenty of explanation of the existing one, in case of a 4.7. You can merge this pull request into a Git repository by running: $ git pull https://github.com/benson-basis/lucene-solr lucene-5389-more-analysis-doc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/14.patch commit 1ddc14c97396183ac99fb9ee5a40bdc09b3994c5 Author: Benson Margulies ben...@basistech.com Date: 2014-01-07T22:52:11Z LUCENE-5389: more analysis advice. Before we change the protocol for tokenizer construction, let's get plenty of explanation of the existing one, in case of a 4.7. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5389) Even more doc for construction of TokenStream components
[ https://issues.apache.org/jira/browse/LUCENE-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864825#comment-13864825 ] Benson Margulies commented on LUCENE-5389: -- https://github.com/apache/lucene-solr/pull/14 Even more doc for construction of TokenStream components Key: LUCENE-5389 URL: https://issues.apache.org/jira/browse/LUCENE-5389 Project: Lucene - Core Issue Type: Improvement Reporter: Benson Margulies There are more useful things to tell would-be authors of tokenizers. Let's tell them. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: oom in documentation-lint
The jtidy-macro we use is not very efficient. It just uses the built-in jtidytask. I think this is a real problem, last i checked it seemed impossible to fix without writing a custom task to integrate with jtidy. we could either disable it, or you could try setting a large Xmx in ANT_OPTS as a workaround, but I do think we need to fix or disable this. On Tue, Jan 7, 2014 at 5:51 PM, Benson Margulies bimargul...@gmail.com wrote: Is there a recipe to avoid this? -documentation-lint: [echo] checking for broken html... [ivy:cachepath] downloading http://repo1.maven.org/maven2/net/sf/jtidy/jtidy/r938/jtidy-r938.jar ... [ivy:cachepath] .. (244kB) [ivy:cachepath] .. (0kB) [ivy:cachepath] [SUCCESSFUL ] net.sf.jtidy#jtidy;r938!jtidy.jar (383ms) [jtidy] Checking for broken html (such as invalid tags)... BUILD FAILED /Users/benson/asf/lucene-solr/build.xml:57: The following error occurred while executing this line: /Users/benson/asf/lucene-solr/lucene/build.xml:208: The following error occurred while executing this line: /Users/benson/asf/lucene-solr/lucene/build.xml:214: The following error occurred while executing this line: /Users/benson/asf/lucene-solr/lucene/common-build.xml:1851: java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2271) at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113) at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282) at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207) at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129) at java.io.BufferedWriter.write(BufferedWriter.java:230) at java.io.PrintWriter.write(PrintWriter.java:456) at java.io.PrintWriter.write(PrintWriter.java:473) at java.io.PrintWriter.print(PrintWriter.java:603) at java.io.PrintWriter.println(PrintWriter.java:739) at org.w3c.tidy.Report.printMessage(Report.java:754) at org.w3c.tidy.Report.errorSummary(Report.java:1572) at org.w3c.tidy.Tidy.parse(Tidy.java:608) at org.w3c.tidy.Tidy.parse(Tidy.java:263) at org.w3c.tidy.ant.JTidyTask.processFile(JTidyTask.java:457) at org.w3c.tidy.ant.JTidyTask.executeSet(JTidyTask.java:420) at org.w3c.tidy.ant.JTidyTask.execute(JTidyTask.java:364) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at org.apache.tools.ant.taskdefs.Sequential.execute(Sequential.java:68) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) Total time: 3 minutes 35 seconds - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks
[ https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864855#comment-13864855 ] Anshum Gupta commented on SOLR-5477: bq. in my experience, when implementing an async callback API like this, it can be handy to require the client to specify the magical... Considering that we have a 1-n relationship between calls made by the client to the OCP and OCP to Cores, we can't really use the client-generated id. We would anyway need multiple ids to be generated at the OCP-to-Core call level. Async execution of OverseerCollectionProcessor tasks Key: SOLR-5477 URL: https://issues.apache.org/jira/browse/SOLR-5477 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Anshum Gupta Attachments: SOLR-5477-CoreAdminStatus.patch Typical collection admin commands are long running and it is very common to have the requests get timed out. It is more of a problem if the cluster is very large. Add an option to run these commands asynchronously: add an extra param async=true for all collection commands; the task is written to ZK and the caller is returned a task id. A separate collection admin command will be added to poll the status of the task: command=status&id=7657668909. If id is not passed, all running async tasks should be listed. A separate queue is created to store in-process tasks. After the tasks are completed the queue entry is removed. OverseerCollectionProcessor will perform these tasks in multiple threads -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
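Purely for illustration of the 1-n relationship Anshum mentions (this is not the SOLR-5477 implementation, and the id format is hypothetical): one parent async id is returned to the caller for the collection-level request, and a derived sub-id is generated for each OCP-to-core call so that per-core status can be tracked and rolled up.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class AsyncTaskIdSketch {

  private static final AtomicLong COUNTER = new AtomicLong();

  /** One client-visible id per collection-level request. */
  static String newParentId(String collection) {
    return collection + "-" + COUNTER.incrementAndGet();
  }

  /** One derived id per OCP-to-core call, so each sub-task can be polled and rolled up. */
  static List<String> newSubTaskIds(String parentId, List<String> coreNames) {
    List<String> ids = new ArrayList<>();
    for (String core : coreNames) {
      ids.add(parentId + "/" + core);
    }
    return ids;
  }

  public static void main(String[] args) {
    String parent = newParentId("collection1"); // this is what the caller would poll on
    List<String> subTasks = newSubTaskIds(parent, Arrays.asList("shard1_replica1", "shard2_replica1"));
    System.out.println("parent:    " + parent);
    System.out.println("sub-tasks: " + subTasks);
  }
}
{code}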
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1201 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1201/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseG1GC All tests passed Build Log: [...truncated 9939 lines...] [junit4] JVM J0: stderr was not empty, see: /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20140107_235447_516.syserr [junit4] JVM J0: stderr (verbatim) [junit4] java(208,0x149d18000) malloc: *** error for object 0x149d06ad1: pointer being freed was not allocated [junit4] *** set a breakpoint in malloc_error_break to debug [junit4] JVM J0: EOF [...truncated 1 lines...] [junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/bin/java -XX:+UseCompressedOops -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=6B057318ACC0851A -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Djdk.map.althashing.threshold=0 -Dtests.disableHdfs=true -Dfile.encoding=ISO-8859-1 -classpath
[jira] [Created] (SOLR-5618) Reproducible failure from TestFiltering.testRandomFiltering
Hoss Man created SOLR-5618: -- Summary: Reproducible failure from TestFiltering.testRandomFiltering Key: SOLR-5618 URL: https://issues.apache.org/jira/browse/SOLR-5618 Project: Solr Issue Type: Bug Reporter: Hoss Man uwe's jenkins found this in java8... http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9004/consoleText {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestFiltering -Dtests.method=testRandomFiltering -Dtests.seed=C22042E80957AE3E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=ar_LY -Dtests.timezone=Asia/Katmandu -Dtests.file.encoding=UTF-8 [junit4] FAILURE 16.9s J1 | TestFiltering.testRandomFiltering [junit4] Throwable #1: java.lang.AssertionError: FAILURE: iiter=11 qiter=336 request=[q, {!frange v=val_i l=0 u=1 cost=139 tag=t}, fq, {!frange v=val_i l=0 u=1}, fq, {! cost=92}-_query_:{!frange v=val_i l=1 u=1}, fq, {!frange v=val_i l=0 u=1 cache=true tag=t}, fq, {! cache=true tag=t}-_query_:{!frange v=val_i l=1 u=1}] [junit4]at __randomizedtesting.SeedInfo.seed([C22042E80957AE3E:DD43E12DEC70EE37]:0) [junit4]at org.apache.solr.search.TestFiltering.testRandomFiltering(TestFiltering.java:327) {noformat} The seed fails consistently for me on trunk using java7, and on 4x using both java7 and java6 - details to follow in comment. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5618) Reproducible failure from TestFiltering.testRandomFiltering
[ https://issues.apache.org/jira/browse/SOLR-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864911#comment-13864911 ] Hoss Man commented on SOLR-5618: Relevant log snipper from jenkins... {noformat} [junit4] 2 558586 T3202 C2360 oasc.SolrCore.execute [collection1] webapp=null path=null params={q={!frange+v%3Dval_i+l%3D0+u%3D1+cost%3D139+tag%3Dt}fq={!frange+v%3Dval_i+l%3D0+u%3D1}fq={!+cost%3D92}-_query_:{!frange+v%3Dval_i+l%3D1+u%3D1}fq={!frange+v%3Dval_i+l%3D0+u%3D1+cache%3Dtrue+tag%3Dt}fq={!+cache%3Dtrue+tag%3Dt}-_query_:{!frange+v%3Dval_i+l%3D1+u%3D1}} hits=0 status=0 QTime=1 [junit4] 2 558586 T3202 oas.SolrTestCaseJ4.assertJQ ERROR query failed JSON validation. error=mismatch: '1'!='0' @ response/numFound [junit4] 2 expected =/response/numFound==1 [junit4] 2 response = { [junit4] 2 responseHeader:{ [junit4] 2status:0, [junit4] 2QTime:1}, [junit4] 2 response:{numFound:0,start:0,docs:[] [junit4] 2 }} [junit4] 2 [junit4] 2 request = q={!frange+v%3Dval_i+l%3D0+u%3D1+cost%3D139+tag%3Dt}fq={!frange+v%3Dval_i+l%3D0+u%3D1}fq={!+cost%3D92}-_query_:{!frange+v%3Dval_i+l%3D1+u%3D1}fq={!frange+v%3Dval_i+l%3D0+u%3D1+cache%3Dtrue+tag%3Dt}fq={!+cache%3Dtrue+tag%3Dt}-_query_:{!frange+v%3Dval_i+l%3D1+u%3D1} [junit4] 2 558587 T3202 oasc.SolrException.log ERROR java.lang.RuntimeException: mismatch: '1'!='0' @ response/numFound [junit4] 2at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:732) [junit4] 2at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:679) [junit4] 2at org.apache.solr.search.TestFiltering.testRandomFiltering(TestFiltering.java:316) ... [junit4] 2 558588 T3202 oass.TestFiltering.testRandomFiltering ERROR FAILURE: iiter=11 qiter=336 request=[q, {!frange v=val_i l=0 u=1 cost=139 tag=t}, fq, {!frange v=val_i l=0 u=1}, fq, {! cost=92}-_query_:{!frange v=val_i l=1 u=1}, fq, {!frange v=val_i l=0 u=1 cache=true tag=t}, fq, {! cache=true tag=t}-_query_:{!frange v=val_i l=1 u=1}] [junit4] 2 558588 T3202 oas.SolrTestCaseJ4.tearDown ###Ending testRandomFiltering [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestFiltering -Dtests.method=testRandomFiltering -Dtests.seed=C22042E80957AE3E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=ar_LY -Dtests.timezone=Asia/Katmandu -Dtests.file.encoding=UTF-8 [junit4] FAILURE 16.9s J1 | TestFiltering.testRandomFiltering [junit4] Throwable #1: java.lang.AssertionError: FAILURE: iiter=11 qiter=336 request=[q, {!frange v=val_i l=0 u=1 cost=139 tag=t}, fq, {!frange v=val_i l=0 u=1}, fq, {! cost=92}-_query_:{!frange v=val_i l=1 u=1}, fq, {!frange v=val_i l=0 u=1 cache=true tag=t}, fq, {! cache=true tag=t}-_query_:{!frange v=val_i l=1 u=1}] [junit4]at __randomizedtesting.SeedInfo.seed([C22042E80957AE3E:DD43E12DEC70EE37]:0) [junit4]at org.apache.solr.search.TestFiltering.testRandomFiltering(TestFiltering.java:327) [junit4]at java.lang.Thread.run(Thread.java:744) {noformat} {noformat} Reproducible failure from TestFiltering.testRandomFiltering --- Key: SOLR-5618 URL: https://issues.apache.org/jira/browse/SOLR-5618 Project: Solr Issue Type: Bug Reporter: Hoss Man uwe's jenkins found this in java8... 
http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9004/consoleText {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestFiltering -Dtests.method=testRandomFiltering -Dtests.seed=C22042E80957AE3E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=ar_LY -Dtests.timezone=Asia/Katmandu -Dtests.file.encoding=UTF-8 [junit4] FAILURE 16.9s J1 | TestFiltering.testRandomFiltering [junit4] Throwable #1: java.lang.AssertionError: FAILURE: iiter=11 qiter=336 request=[q, {!frange v=val_i l=0 u=1 cost=139 tag=t}, fq, {!frange v=val_i l=0 u=1}, fq, {! cost=92}-_query_:{!frange v=val_i l=1 u=1}, fq, {!frange v=val_i l=0 u=1 cache=true tag=t}, fq, {! cache=true tag=t}-_query_:{!frange v=val_i l=1 u=1}] [junit4] at __randomizedtesting.SeedInfo.seed([C22042E80957AE3E:DD43E12DEC70EE37]:0) [junit4] at org.apache.solr.search.TestFiltering.testRandomFiltering(TestFiltering.java:327) {noformat} The seed fails consistently for me on trunk using java7, and on 4x using both java7 and java6 - details to follow in comment. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: The Old Git Discussion
+1, Mark. Git isn't perfect; I sympathize with the annoyances pointed out by Rob et al. But I think we would be better off for it -- a net win considering the upsides. In the end I'd love to track changes via branches (which includes forks people make to add changes), not with attaching patch files to an issue tracker. The way we do things here sucks for collaboration and it's a higher bar for people to get involved than it can and should be. ~ David Mark Miller-3 wrote I don’t really buy the fad argument, but as I’ve said, I’m willing to wait a little longer for others to catch on. I try and follow the stats and reports and articles on this pretty closely. As I mentioned early in the thread, by all appearances, the shift from SVN to GIT looks much like the shift from CVS to SVN. This was not a fad change, nor is the next mass movement likely to be. Just like no one starts a project on CVS anymore, we are almost already to the point where new projects start exclusively on GIT - especially open source. I’m happy to sit back and watch the trend continue though. The number of GIT users in the committee and among the committers only grows every time the discussion comes up. If this was 2009, 2010, 2011 … who knows, perhaps I would buy some fad argument. But it just doesn’t jibe in 2014. - Mark - Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/The-Old-Git-Discussion-tp4109193p4110109.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5618) Reproducible failure from TestFiltering.testRandomFiltering
[ https://issues.apache.org/jira/browse/SOLR-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-5618: --- Attachment: SOLR-5618.patch This smells like a caching-related bug ... but I have no idea why/where. The test does multiple iterations where in each iteration it builds an index of a random number of documents, each containing an incremented value for id and val_i -- the number of documents can range from 1 to 21, with the id and val_i fields starting at 0. Then it generates a bunch of random requests consisting of random q and fq params. This is what the failing request looks like... {noformat} q = {!frange v=val_i l=0 u=1 cost=139 tag=t} fq = {!frange v=val_i l=0 u=1} fq = {! cost=92}-_query_:{!frange v=val_i l=1 u=1} fq = {!frange v=val_i l=0 u=1 cache=true tag=t} fq = {! cache=true tag=t}-_query_:{!frange v=val_i l=1 u=1} {noformat} So basically: it will only ever match docs which have val_i==0 -- which, given how the index is built, means it should always match exactly 1 document: the 0th doc -- but in the failure message we can see that it doesn't match any docs. (FWIW: adding some debugging indicates that in the iteration where this fails, the index only has 2 documents in it -- doc#0 and doc#1) In the patch I'm attaching, I hacked the test to explicitly attempt the above query in every iteration, regardless of the num docs in the index, immediately after building the index -- and that new assertion never fails. But then after it passes, it continues on with the existing logic, generating a bunch of random requests and executing them -- and when it randomly generates the same query as above (that already succeeded in matching 1 doc against the current index), that query then fails to match any docs. Which smells to me like some sort of filter caching glitch .. right? Reproducible failure from TestFiltering.testRandomFiltering --- Key: SOLR-5618 URL: https://issues.apache.org/jira/browse/SOLR-5618 Project: Solr Issue Type: Bug Reporter: Hoss Man Attachments: SOLR-5618.patch uwe's jenkins found this in java8... http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9004/consoleText {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestFiltering -Dtests.method=testRandomFiltering -Dtests.seed=C22042E80957AE3E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=ar_LY -Dtests.timezone=Asia/Katmandu -Dtests.file.encoding=UTF-8 [junit4] FAILURE 16.9s J1 | TestFiltering.testRandomFiltering [junit4] Throwable #1: java.lang.AssertionError: FAILURE: iiter=11 qiter=336 request=[q, {!frange v=val_i l=0 u=1 cost=139 tag=t}, fq, {!frange v=val_i l=0 u=1}, fq, {! cost=92}-_query_:{!frange v=val_i l=1 u=1}, fq, {!frange v=val_i l=0 u=1 cache=true tag=t}, fq, {! cache=true tag=t}-_query_:{!frange v=val_i l=1 u=1}] [junit4] at __randomizedtesting.SeedInfo.seed([C22042E80957AE3E:DD43E12DEC70EE37]:0) [junit4] at org.apache.solr.search.TestFiltering.testRandomFiltering(TestFiltering.java:327) {noformat} The seed fails consistently for me on trunk using java7, and on 4x using both java7 and java6 - details to follow in comment. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
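To make the "should always match exactly one document" reasoning explicit, here is a small stand-alone sketch that evaluates the same clause combination as plain predicates over val_i. It only models the query semantics, not Solr's filter cache, which is where the bug appears to live.

{code:java}
import java.util.function.IntPredicate;

public class FrangeSemanticsSketch {
  public static void main(String[] args) {
    IntPredicate q   = v -> v >= 0 && v <= 1;    // {!frange l=0 u=1}
    IntPredicate fq1 = v -> v >= 0 && v <= 1;    // {!frange l=0 u=1}
    IntPredicate fq2 = v -> !(v >= 1 && v <= 1); // NOT {!frange l=1 u=1}
    IntPredicate fq3 = v -> v >= 0 && v <= 1;    // cached variant of fq1
    IntPredicate fq4 = v -> !(v >= 1 && v <= 1); // cached variant of fq2

    IntPredicate all = q.and(fq1).and(fq2).and(fq3).and(fq4);

    // Index from the failing iteration: two docs, val_i = 0 and val_i = 1.
    for (int val = 0; val <= 1; val++) {
      System.out.println("val_i=" + val + " matches: " + all.test(val));
    }
    // Prints val_i=0 matches: true, val_i=1 matches: false -> exactly one hit expected.
  }
}
{code}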
[jira] [Commented] (SOLR-2553) Nested Field Collapsing
[ https://issues.apache.org/jira/browse/SOLR-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865012#comment-13865012 ] Kranti Parisa commented on SOLR-2553: - I think we will also need to support other grouping params, especially group.limit, so that users can restrict the results even with nested groups. Nested Field Collapsing --- Key: SOLR-2553 URL: https://issues.apache.org/jira/browse/SOLR-2553 Project: Solr Issue Type: Improvement Components: SearchComponents - other Reporter: Martijn Laarman Currently specifying grouping on multiple fields returns multiple datasets. It would be nice if Solr supported cascading / nested grouping by applying the first group over the entire result set, the next over each group, and so forth. Even if limited to supporting nested grouping 2 levels deep, it would cover a lot of use cases. group.field=location&group.field=type -Location X ---Type 1 -documents ---Type 2 documents -Location Y ---Type 1 documents ---Type 2 documents instead of -Location X -- documents -Location Y --documents -Type 1 --documents -Type2 --documents -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
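A small stand-alone sketch of the requested behaviour, including the group.limit idea from the comment above: group by location, then by type within each location, and cap each innermost group. The field names follow the example in the issue; none of this is existing Solr grouping code.

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedGroupingSketch {

  static class Doc {
    final String id, location, type;
    Doc(String id, String location, String type) { this.id = id; this.location = location; this.type = type; }
    public String toString() { return id; }
  }

  public static void main(String[] args) {
    List<Doc> docs = Arrays.asList(
        new Doc("d1", "X", "1"), new Doc("d2", "X", "1"), new Doc("d3", "X", "2"),
        new Doc("d4", "Y", "1"), new Doc("d5", "Y", "2"), new Doc("d6", "Y", "2"));
    int groupLimit = 1; // keep at most one doc per innermost group

    Map<String, Map<String, List<Doc>>> nested = docs.stream().collect(
        Collectors.groupingBy(d -> d.location,
            Collectors.groupingBy(d -> d.type,
                Collectors.collectingAndThen(Collectors.toList(),
                    l -> l.subList(0, Math.min(groupLimit, l.size()))))));

    // e.g. X -> {1=[d1], 2=[d3]}, Y -> {1=[d4], 2=[d5]} (map order may vary)
    nested.forEach((loc, byType) -> System.out.println("Location " + loc + " -> " + byType));
  }
}
{code}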
[jira] [Commented] (SOLR-5560) Enable LocalParams without escaping the query
[ https://issues.apache.org/jira/browse/SOLR-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865053#comment-13865053 ] Ryan Cutter commented on SOLR-5560: --- I don't know, I assume a committer familiar with this area will take a look in the near future. I see other unassigned tickets with patches attached so I'm sure there's a process. Enable LocalParams without escaping the query - Key: SOLR-5560 URL: https://issues.apache.org/jira/browse/SOLR-5560 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.6 Reporter: Isaac Hebsh Fix For: 4.7, 4.6.1 Attachments: SOLR-5560.patch This query should be a legit syntax: http://localhost:8983/solr/collection1/select?debugQuery=truedefType=lucenedf=idq=TERM1 AND {!lucene df=text}(TERM2 TERM3 TERM4 TERM5) currently it isn't, because the LocalParams can be specified on a single term only. [~billnbell] thinks it is a bug. From the mailing list: {quote} We want to set a LocalParam on a nested query. When quering with v inline parameter, it works fine: http://localhost:8983/solr/collection1/select?debugQuery=truedefType=lucenedf=idq=TERM1 AND {!lucene df=text v=TERM2 TERM3 \TERM4 TERM5\} the parsedquery_toString is +id:TERM1 +(text:term2 text:term3 text:term4 term5) Query using the _query_ also works fine: http://localhost:8983/solr/collection1/select?debugQuery=truedefType=lucenedf=idq=TERM1 AND _query_:{!lucene df=text}TERM2 TERM3 \TERM4 TERM5\ (parsedquery is exactly the same). Obviously, there is the option of external parameter ({... v=$nestedq}nestedq=...) This is a good solution, but it is not practical, when having a lot of such nested queries. BUT, when trying to put the nested query in place, it yields syntax error: http://localhost:8983/solr/collection1/select?debugQuery=truedefType=lucenedf=idq=TERM1 AND {!lucene df=text}(TERM2 TERM3 TERM4 TERM5) org.apache.solr.search.SyntaxError: Cannot parse '(TERM2' The previous options are less preferred, because the escaping that should be made on the nested query. {quote} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5610) Support cluster-wide properties with an API called CLUSTERPROP
[ https://issues.apache.org/jira/browse/SOLR-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-5610: - Description: Add a collection admin API for cluster wide property management the new API would create an entry in the root as /cluster-props.json {code:javascript} { prop:val } {code} The API would work as /command=clusterpropname=propNamevalue=propVal there will be a set of well-known properties which can be set or unset with this command was: Add a collection admin API for cluster wide property management the new API would create an entry in the root as /cluster-props.json {code:javascipt} { prop:val } The API would work as /command=clusterpropname=propNamevalue=propVal there will be a set of well-known properties which can be set or unset with this command Support cluster-wide properties with an API called CLUSTERPROP -- Key: SOLR-5610 URL: https://issues.apache.org/jira/browse/SOLR-5610 Project: Solr Issue Type: Bug Reporter: Noble Paul Add a collection admin API for cluster wide property management the new API would create an entry in the root as /cluster-props.json {code:javascript} { prop:val } {code} The API would work as /command=clusterpropname=propNamevalue=propVal there will be a set of well-known properties which can be set or unset with this command -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org