[jira] [Commented] (SOLR-9530) Add an Atomic Update Processor
[ https://issues.apache.org/jira/browse/SOLR-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882167#comment-15882167 ] Hamso commented on SOLR-9530: - I have a question. What if we have doc1 {noformat} {id: 1, Street: xyz} {noformat} and doc2 {noformat} {id: 1, firstName: Tom, lastName: Cruise} {id: 1, firstName: Max, lastName: Mueller} {noformat} What will the final doc look like? > Add an Atomic Update Processor > --- > > Key: SOLR-9530 > URL: https://issues.apache.org/jira/browse/SOLR-9530 > Project: Solr > Issue Type: New Feature > Security Level: Public (Default Security Level. Issues are Public) > Reporter: Varun Thacker > Attachments: SOLR-9530.patch, SOLR-9530.patch, SOLR-9530.patch > > > I'd like to explore the idea of adding a new update processor to help ingest > partial updates. > Example use-case - There are two datasets with a common id field. How can I > merge both of them at index time? > Proposed Solution: > {code} > > > add > > > > > {code} > So the first JSON dump could be ingested against > {{http://localhost:8983/solr/gettingstarted/update/json}} > And then the second JSON could be ingested against > {{http://localhost:8983/solr/gettingstarted/update/json?processor=atomic}} > The Atomic Update Processor could support all the atomic update operations > currently supported. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
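To make the question concrete: with Solr's existing atomic-update syntax (using the {{set}} operation; the field names are taken from the example above), the second dataset's records could be expressed as atomic operations like the sketch below. Presumably, if {{set}} semantics apply and the two records are processed in order, the later update for the same id wins, leaving something like {id:1, Street:xyz, firstName:Max, lastName:Mueller} — though that is exactly the behavior being asked about, not a confirmed answer.

```json
[
  {"id": "1", "firstName": {"set": "Tom"}, "lastName": {"set": "Cruise"}},
  {"id": "1", "firstName": {"set": "Max"}, "lastName": {"set": "Mueller"}}
]
```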
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882136#comment-15882136 ] Simon Willnauer commented on LUCENE-7707: - [~jpountz] are your concerns addressed with the latest patch? > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, > LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch > > > When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > only assign this value if it has not already been assigned to a non-default > (-1) value, to allow multiple or sparse top docs merging.
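A minimal sketch (in Python, not Lucene's actual Java implementation) of the proposed merge behavior: assign {{shardIndex}} only while it is still the default -1, so an index assigned in an earlier incremental merge phase survives later merges.

```python
from dataclasses import dataclass

@dataclass
class ScoreDoc:
    doc: int
    score: float
    shard_index: int = -1  # -1 is the "unassigned" default

def merge(shards):
    """Merge per-shard hit lists by descending score, assigning
    shard_index only where it has not been assigned before."""
    merged = []
    for i, hits in enumerate(shards):
        for sd in hits:
            if sd.shard_index == -1:   # proposed change: don't clobber
                sd.shard_index = i     # an already-assigned index
            merged.append(sd)
    merged.sort(key=lambda sd: -sd.score)
    return merged
```

A hit carrying {{shard_index=7}} from a previous merge phase keeps that value, while fresh hits get the index of their position in the outer array.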
[jira] [Commented] (SOLR-9530) Add an Atomic Update Processor
[ https://issues.apache.org/jira/browse/SOLR-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882079#comment-15882079 ] Noble Paul commented on SOLR-9530: -- I'm not sure the second request must fail. Imagine a doc already exists with {code} {id: 1} {code} and subsequently the user sends 2 parallel requests {code} {id:1, firstName: Tom} {id:1, lastName: Cruise} {code} After these two operations are performed, the final doc should be {code} {id:1, firstName:Tom, lastName:Cruise} {code} The system should handle race conditions gracefully. The URP can fetch the {{\_version_}} before sending the appropriate atomic operation, using optimistic concurrency. If the request fails, it can reload the {{\_version_}} and retry.
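A rough sketch of that retry loop, simulated against an in-memory store ({{get_doc}} and {{conditional_update}} are hypothetical stand-ins for a real Solr client and for Solr's {{\_version_}}-based conditional update; this is not the URP implementation):

```python
# In-memory stand-in for the index; a real URP would talk to Solr.
store = {"1": {"id": "1", "_version_": 1}}

def get_doc(doc_id):
    return dict(store[doc_id])

def conditional_update(doc_id, fields, expected_version):
    """Apply the fields only if _version_ matches (optimistic concurrency);
    in real Solr a mismatch would surface as a version-conflict error."""
    doc = store[doc_id]
    if doc["_version_"] != expected_version:
        return False
    doc.update(fields)
    doc["_version_"] += 1
    return True

def atomic_update_with_retry(doc_id, fields, max_retries=5):
    """Fetch the current _version_, attempt the update, and on a
    conflict reload the version and retry."""
    for _ in range(max_retries):
        version = get_doc(doc_id)["_version_"]
        if conditional_update(doc_id, fields, expected_version=version):
            return True
    return False
```

With this scheme, two racing requests setting {{firstName}} and {{lastName}} both eventually succeed, and the final doc carries both fields.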
[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882054#comment-15882054 ] Varun Rajput commented on SOLR-6736: I like the approach of "trusted" configsets. How would the restrictions on vulnerable components be imposed? > A collections-like request handler to manage solr configurations on zookeeper > - > > Key: SOLR-6736 > URL: https://issues.apache.org/jira/browse/SOLR-6736 > Project: Solr > Issue Type: New Feature > Components: SolrCloud > Reporter: Varun Rajput > Assignee: Ishan Chattopadhyaya > Attachments: newzkconf.zip, SOLR-6736-newapi.patch, > SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, > test_pub.der, zkconfighandler.zip, zkconfighandler.zip > > > Managing Solr configuration files on zookeeper becomes cumbersome while using > solr in cloud mode, especially while trying out changes in the > configurations. > It would be great if there were a request handler providing an API to > manage the configurations, similar to the collections handler, that would allow > actions like uploading new configurations, linking them to a collection, > deleting configurations, etc. > example: > {code} > # use the following command to upload a new configset called mynewconf. This > will fail if there is already a conf called 'mynewconf'. The file could be a > jar, zip or tar file which contains all the files for this conf. > curl -X POST -H 'Content-Type: application/octet-stream' --data-binary > @testconf.zip > http://localhost:8983/solr/admin/configs/mynewconf?sig= > {code} > A GET to http://localhost:8983/solr/admin/configs will give a list of configs > available > A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the > list of files in mynewconf
[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881970#comment-15881970 ] Ishan Chattopadhyaya edited comment on SOLR-6736 at 2/24/17 5:08 AM: - bq. So this would not affect the setups which have security enabled. Right. bq. If this endpoint is secured using authorization and authentication, then we can store the uploaded configsets with "trusted=true". Those "trusted" configsets can be used to create collections without any restrictions.
[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881970#comment-15881970 ] Ishan Chattopadhyaya commented on SOLR-6736: Right. bq. If this endpoint is secured using authorization and authentication, then we can store the uploaded configsets with "trusted=true". Those "trusted" configsets can be used to create collections without any restrictions.
[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881959#comment-15881959 ] Hrishikesh Gadre commented on SOLR-6736: [~ichattopadhyaya] bq. This seems like a sound approach in theory, but often times users don't follow proper procedures for deployment and end up exposing their deployments without proper authentication/authorization. This extra security is to save such users from potential remote code execution based attacks. Our guidelines should, anyway, be for admins to enable security before going to production. Oh, I think I misunderstood your earlier statement. So this would not affect the setups which have security enabled.
[jira] [Commented] (SOLR-8440) Script support for enabling basic auth
[ https://issues.apache.org/jira/browse/SOLR-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881944#comment-15881944 ] Ishan Chattopadhyaya commented on SOLR-8440: Agreed. > Script support for enabling basic auth > -- > > Key: SOLR-8440 > URL: https://issues.apache.org/jira/browse/SOLR-8440 > Project: Solr > Issue Type: New Feature > Components: scripts and tools > Reporter: Jan Høydahl > Assignee: Ishan Chattopadhyaya > Labels: authentication, security > > Now that BasicAuthPlugin will be able to work without an AuthorizationPlugin > (SOLR-8429), it would be sweet to provide a super simple way to "Password > protect Solr"™ right from the command line: > {noformat} > bin/solr basicAuth -adduser -user solr -pass SolrRocks > {noformat} > It would take the mystery out of enabling one single password across the > board. The command would do something like this: > # Check if HTTPS is enabled, and if not, print a friendly warning > # Check if {{/security.json}} already exists > ## NO => create one with only the plugin class defined > ## YES => Abort if it exists but the plugin is not {{BasicAuthPlugin}} > # Using the security REST API, add the new user
[jira] [Commented] (SOLR-8440) Script support for enabling basic auth
[ https://issues.apache.org/jira/browse/SOLR-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881936#comment-15881936 ] Noble Paul commented on SOLR-8440: -- I guess we must enable {{RulebasedAuthorization}} as well, and ensure that {{collection-admin-edit}}, {{core-admin-edit}} and {{security-edit}} are protected.
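For illustration, a minimal {{security.json}} along those lines might look like the sketch below. The credentials string is the well-known example hash for the password {{SolrRocks}} from the Solr Reference Guide; treat the exact permission set and role names as an assumption, not a finished proposal.

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "solr": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      {"name": "security-edit", "role": "admin"},
      {"name": "collection-admin-edit", "role": "admin"},
      {"name": "core-admin-edit", "role": "admin"}
    ],
    "user-role": {"solr": "admin"}
  }
}
```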
[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881915#comment-15881915 ] Ishan Chattopadhyaya edited comment on SOLR-6736 at 2/24/17 4:35 AM: - {quote} If it is to simplify the development process, then that can be mitigated by setting up an unsecure Solr cluster in the staging environment. In case of a production environment all endpoints must be authenticated using the configured mechanism if security is enabled. {quote} This seems like a sound approach in theory, but often times users don't follow proper procedures for deployment and end up exposing their deployments without proper authentication/authorization. This extra security is to save such users from potential remote code execution based attacks. Our guidelines should, anyway, be for admins to enable security before going to production. Having this feature disabled out of the box was the other alternative explored above (to protect users who might end up exposing their cluster without securing it first), but I think it is inconvenient and can (and should) be avoided.
[jira] [Commented] (SOLR-8440) Script support for enabling basic auth
[ https://issues.apache.org/jira/browse/SOLR-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881920#comment-15881920 ] Ishan Chattopadhyaya commented on SOLR-8440: I'm planning to work on this soon.
[jira] [Assigned] (SOLR-8440) Script support for enabling basic auth
[ https://issues.apache.org/jira/browse/SOLR-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya reassigned SOLR-8440: -- Assignee: Ishan Chattopadhyaya (was: Jan Høydahl)
[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881919#comment-15881919 ] Noble Paul commented on SOLR-6736: -- Don't disable {{DataImportHandler}}, just disable {{ScriptTransformer}}.
[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881915#comment-15881915 ] Ishan Chattopadhyaya commented on SOLR-6736: This seems like a sound approach in theory, but often times users don't follow proper procedures for deployment and end up exposing their deployments without proper authentication/authorization. This extra security is to save such users from potential remote code execution based attacks. Our guidelines should, anyway, be for admins to enable security before going to production. Having this feature disabled out of the box was the other alternative that was explored above (to protect users who might end up exposing their cluster without securing it first), but I think it is inconvenient and can (and should) be avoided.
[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881903#comment-15881903 ] Hrishikesh Gadre edited comment on SOLR-6736 at 2/24/17 4:10 AM: - [~ichattopadhyaya] bq. We can allow unauthenticated/unauthorized users to upload a configset, I am not following why this endpoint needs to be "unsecure". If it is to simplify the development process, then that can be mitigated by setting up an unsecure Solr cluster in the staging environment. In case of a production environment, all endpoints must be authenticated using the configured mechanism if security is enabled. This request handler should also implement the PermissionNameProvider interface so that only users who have "CONFIG_EDIT_PERM" can update it.
[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881903#comment-15881903 ] Hrishikesh Gadre edited comment on SOLR-6736 at 2/24/17 4:09 AM: - [~ichattopadhyaya] bq. We can allow unauthenticated/unauthorized users to upload a configset, I am not following why this endpoint needs to be "unsecure" ? If it is to simplify the development process, then that can be mitigated by setting up a unsecure Solr cluster in the staging environment. In case of a production environment - all endpoints must be authenticated using the mechanism configured in the security.json. This request handler should also implement PermissionNameProvider interface so that only users which have "CONFIG_EDIT_PERM" can update it. was (Author: hgadre): [~ichattopadhyaya] bq. We can allow unauthenticated/unauthorized users to upload a configset, I am not following why this endpoint needs to be "unsecure" ? If it is to simplify the development process, then that can be mitigated by setting up a dev Solr cluster in the staging environment. In case of a production environment - all endpoints must be authenticated using the mechanism configured in the security.json. This request handler should also implement PermissionNameProvider interface so that only users which have "CONFIG_EDIT_PERM" can update it. 
> A collections-like request handler to manage solr configurations on zookeeper > - > > Key: SOLR-6736 > URL: https://issues.apache.org/jira/browse/SOLR-6736 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Varun Rajput >Assignee: Ishan Chattopadhyaya > Attachments: newzkconf.zip, SOLR-6736-newapi.patch, > SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, > test_pub.der, zkconfighandler.zip, zkconfighandler.zip > > > Managing Solr configuration files on zookeeper becomes cumbersome while using > solr in cloud mode, especially while trying out changes in the > configurations. > It will be great if there is a request handler that can provide an API to > manage the configurations similar to the collections handler that would allow > actions like uploading new configurations, linking them to a collection, > deleting configurations, etc. > example : > {code} > #use the following command to upload a new configset called mynewconf. This > will fail if there is alredy a conf called 'mynewconf'. The file could be a > jar , zip or a tar file which contains all the files for the this conf. > curl -X POST -H 'Content-Type: application/octet-stream' --data-binary > @testconf.zip > http://localhost:8983/solr/admin/configs/mynewconf?sig= > {code} > A GET to http://localhost:8983/solr/admin/configs will give a list of configs > available > A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the > list of files in mynewconf -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881903#comment-15881903 ] Hrishikesh Gadre commented on SOLR-6736: [~ichattopadhyaya] bq. We can allow unauthenticated/unauthorized users to upload a configset, I am not following why this endpoint needs to be "unsecure" ? If it is to simplify the development process, then that can be mitigated by setting up a dev Solr cluster in the staging environment. In case of a production environment - all endpoints must be authenticated using the mechanism configured in the security.json. This request handler should also implement PermissionNameProvider interface so that only users which have "CONFIG_EDIT_PERM" can update it. > A collections-like request handler to manage solr configurations on zookeeper > - > > Key: SOLR-6736 > URL: https://issues.apache.org/jira/browse/SOLR-6736 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Varun Rajput >Assignee: Ishan Chattopadhyaya > Attachments: newzkconf.zip, SOLR-6736-newapi.patch, > SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, > test_pub.der, zkconfighandler.zip, zkconfighandler.zip > > > Managing Solr configuration files on zookeeper becomes cumbersome while using > solr in cloud mode, especially while trying out changes in the > configurations. > It will be great if there is a request handler that can provide an API to > manage the configurations similar to the collections handler that would allow > actions like uploading new configurations, linking them to a collection, > deleting configurations, etc. > example : > {code} > #use the following command to upload a new configset called mynewconf. This > will fail if there is alredy a conf called 'mynewconf'. 
The file could be a > jar, zip or tar file which contains all the files for this conf. > curl -X POST -H 'Content-Type: application/octet-stream' --data-binary > @testconf.zip > http://localhost:8983/solr/admin/configs/mynewconf?sig= > {code} > A GET to http://localhost:8983/solr/admin/configs will give a list of configs > available > A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the > list of files in mynewconf
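The three endpoints described in the example can be sketched as follows. This only illustrates the proposed URL layout from the issue description; the helper functions are hypothetical and no request is actually sent.

```python
# Illustrative sketch of the proposed configset admin endpoints described
# above (upload, list, inspect one configset). The helpers only build URLs;
# the endpoint layout is the issue's proposal, not a released API.

BASE = "http://localhost:8983/solr/admin/configs"

def upload_url(configset):
    # POST target for uploading a new configset archive
    return "%s/%s" % (BASE, configset)

def list_url():
    # GET target listing all available configsets
    return BASE

def files_url(configset):
    # GET target listing the files inside one configset
    return "%s/%s" % (BASE, configset)

print(upload_url("mynewconf"))  # http://localhost:8983/solr/admin/configs/mynewconf
```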
[jira] [Comment Edited] (SOLR-6203) cast exception while searching with sort function and result grouping
[ https://issues.apache.org/jira/browse/SOLR-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881896#comment-15881896 ] Judith Silverman edited comment on SOLR-6203 at 2/24/17 4:02 AM: - Hi, Christine, are there changes you would like me to make to the patch dated 05Dec16? As I recall, your commit of 23Dec16 took us part of the way toward that patch. Do my concerns about SOLR_9660 (02Dec16 above) make sense? Thanks, Judith was (Author: judith): Hi, Christine, are there changes you would like me to make in the patch dated 05Dec16? As I recall, your commit of 23Dec16 took us part of the way toward that patch. Do my concerns about SOLR_9660 (02Dec16 above) make sense? Thanks, Judith > cast exception while searching with sort function and result grouping > - > > Key: SOLR-6203 > URL: https://issues.apache.org/jira/browse/SOLR-6203 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 4.7, 4.8 >Reporter: Nate Dire >Assignee: Christine Poerschke > Attachments: README, SOLR-6203.patch, SOLR-6203.patch, > SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, > SOLR-6203-unittest.patch, SOLR-6203-unittest.patch > > > After upgrading from 4.5.1 to 4.7+, a schema including a {{"*"}} dynamic > field as text gets a cast exception when using a sort function and result > grouping. 
> Repro (with example config): > # Add {{"*"}} dynamic field as a {{TextField}}, eg: > {noformat} > > {noformat} > # Create sharded collection > {noformat} > curl > 'http://localhost:8983/solr/admin/collections?action=CREATE=test=2=2' > {noformat} > # Add example docs (query must have some results) > # Submit query which sorts on a function result and uses result grouping: > {noformat} > { > "responseHeader": { > "status": 500, > "QTime": 50, > "params": { > "sort": "sqrt(popularity) desc", > "indent": "true", > "q": "*:*", > "_": "1403709010008", > "group.field": "manu", > "group": "true", > "wt": "json" > } > }, > "error": { > "msg": "java.lang.Double cannot be cast to > org.apache.lucene.util.BytesRef", > "code": 500 > } > } > {noformat} > Source exception from log: > {noformat} > ERROR - 2014-06-25 08:10:10.055; org.apache.solr.common.SolrException; > java.lang.ClassCastException: java.lang.Double cannot be cast to > org.apache.lucene.util.BytesRef > at > org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:981) > at org.apache.solr.schema.TextField.marshalSortValue(TextField.java:176) > at > org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.serializeSearchGroup(SearchGroupsResultTransformer.java:125) > at > org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:65) > at > org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:43) > at > org.apache.solr.search.grouping.CommandHandler.processResult(CommandHandler.java:193) > at > org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:340) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) > ... 
> {noformat} > It looks like {{serializeSearchGroup}} is matching the sort expression as the > {{"*"}} dynamic field, which is a TextField in the repro.
[jira] [Commented] (SOLR-6203) cast exception while searching with sort function and result grouping
[ https://issues.apache.org/jira/browse/SOLR-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881896#comment-15881896 ] Judith Silverman commented on SOLR-6203: Hi, Christine, are there changes you would like me to make in the patch dated 05Dec16? As I recall, your commit of 23Dec16 took us part of the way toward that patch. Do my concerns about SOLR_9660 (02Dec16 above) make sense? Thanks, Judith > cast exception while searching with sort function and result grouping > - > > Key: SOLR-6203 > URL: https://issues.apache.org/jira/browse/SOLR-6203 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 4.7, 4.8 >Reporter: Nate Dire >Assignee: Christine Poerschke > Attachments: README, SOLR-6203.patch, SOLR-6203.patch, > SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, > SOLR-6203-unittest.patch, SOLR-6203-unittest.patch > > > After upgrading from 4.5.1 to 4.7+, a schema including a {{"*"}} dynamic > field as text gets a cast exception when using a sort function and result > grouping. 
> Repro (with example config): > # Add {{"*"}} dynamic field as a {{TextField}}, eg: > {noformat} > > {noformat} > # Create sharded collection > {noformat} > curl > 'http://localhost:8983/solr/admin/collections?action=CREATE=test=2=2' > {noformat} > # Add example docs (query must have some results) > # Submit query which sorts on a function result and uses result grouping: > {noformat} > { > "responseHeader": { > "status": 500, > "QTime": 50, > "params": { > "sort": "sqrt(popularity) desc", > "indent": "true", > "q": "*:*", > "_": "1403709010008", > "group.field": "manu", > "group": "true", > "wt": "json" > } > }, > "error": { > "msg": "java.lang.Double cannot be cast to > org.apache.lucene.util.BytesRef", > "code": 500 > } > } > {noformat} > Source exception from log: > {noformat} > ERROR - 2014-06-25 08:10:10.055; org.apache.solr.common.SolrException; > java.lang.ClassCastException: java.lang.Double cannot be cast to > org.apache.lucene.util.BytesRef > at > org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:981) > at org.apache.solr.schema.TextField.marshalSortValue(TextField.java:176) > at > org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.serializeSearchGroup(SearchGroupsResultTransformer.java:125) > at > org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:65) > at > org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:43) > at > org.apache.solr.search.grouping.CommandHandler.processResult(CommandHandler.java:193) > at > org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:340) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) > ... 
> {noformat} > It looks like {{serializeSearchGroup}} is matching the sort expression as the > {{"*"}} dynamic field, which is a TextField in the repro.
[jira] [Created] (SOLR-10200) Streaming Expressions should use the shards parameter if present
Joel Bernstein created SOLR-10200: - Summary: Streaming Expressions should use the shards parameter if present Key: SOLR-10200 URL: https://issues.apache.org/jira/browse/SOLR-10200 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Joel Bernstein Currently Streaming Expressions select shards using an internal ZooKeeper client. This ticket will allow stream sources to accept a *shards* parameter so that non-SolrCloud deployments can set the shards manually. The shards parameters will be added as http parameters in the following format: collectionA.shards=url1,url2,...&collectionB.shards=url1,url2,... The /stream handler will then add the shards to the StreamContext so all stream sources can check to see if their collection has the shards set manually.
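The per-collection parameter format described above can be sketched with a small helper. The collection names and shard URLs below are placeholders, and the format itself is the ticket's proposal rather than shipped behavior.

```python
# Sketch of the proposed per-collection shards parameter format
# (collectionName.shards=url1,url2,...). Collection names and shard URLs
# are placeholders; the format is the ticket's proposal, not a released API.

def shards_param(collection, shard_urls):
    return "%s.shards=%s" % (collection, ",".join(shard_urls))

query_string = "&".join([
    shards_param("collectionA", ["http://host1/solr/shard1", "http://host2/solr/shard2"]),
    shards_param("collectionB", ["http://host3/solr/shard1"]),
])
print(query_string)
```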
[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881883#comment-15881883 ] Ishan Chattopadhyaya edited comment on SOLR-6736 at 2/24/17 3:42 AM: - I have created a branch jira/solr-6736 with the latest patch (after updating it for master). Regarding the security vulnerability that this new API exposes, I have the following thoughts to take this forward: # We can allow unauthenticated/unauthorized users to upload a configset, but mark such configsets with a "trusted=false" flag while storing in ZK (metadata on the configset's znode). If this endpoint is secured using authorization and authentication, then we can store the uploaded configsets with "trusted=true". # Upon creation of a collection using an untrusted configset, any attempt to register a "vulnerable" component, e.g. StatelessScriptUpdateProcessor, XsltUpdateRequestHandler, DataImportHandler etc., should fail with an error that indicates that the configset was not trusted and it can be made trusted by enabling authentication/authorization for the API endpoint and re-uploading the configset. Same error when using a config API command to register any update handler using an untrusted configset. # Ensure that untrusted configsets never overwrite existing trusted configsets. As a separate exercise, we should audit our use of the XML parser to ensure XXE attacks are not possible on XML files, either uploaded from here/elsewhere or loaded from the disk. [~varunrajput], [~anshumg], [~noble.paul], WDYT? was (Author: ichattopadhyaya): I have created a branch jira/solr-6736 with the latest patch (after updating it for master). Regarding the security vulnerability that this new API exposes, I have the following thoughts to take this forward: # We can allow unauthenticated/unauthorized users to upload a configset, but mark such configsets with a "trusted=false" flag while storing in ZK (metadata on the configset's znode). 
If this endpoint is secured using authorization and authentication, then we can store the uploaded configsets with "trusted=true". # Upon creation of a collection using an untrusted configset, any attempt to register a "vulnerable" component, e.g. StatelessScriptUpdateProcessor, XsltUpdateRequestHandler, DataImportHandler etc., should fail with an error that indicates that the configset was not trusted and it can be made trusted by enabling authentication/authorization for the API endpoint and re-uploading the configset. Same error when using a config API command to register any update handler using an untrusted configset. As a separate exercise, we should audit our use of the XML parser to ensure XXE attacks are not possible on XML files, either uploaded from here/elsewhere or loaded from the disk. [~varunrajput], [~anshumg], [~noble.paul], WDYT? > A collections-like request handler to manage solr configurations on zookeeper > - > > Key: SOLR-6736 > URL: https://issues.apache.org/jira/browse/SOLR-6736 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Varun Rajput >Assignee: Ishan Chattopadhyaya > Attachments: newzkconf.zip, SOLR-6736-newapi.patch, > SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, > test_pub.der, zkconfighandler.zip, zkconfighandler.zip > > > Managing Solr configuration files on zookeeper becomes cumbersome while using > solr in cloud mode, especially while trying out changes in the > configurations. > It will be great if there is a request handler that can provide an API to > manage the configurations similar to the collections handler that would allow > actions like uploading new configurations, linking them to a collection, > deleting configurations, etc. > example : > {code} > #use the following command to upload a new configset called mynewconf. 
This > will fail if there is already a conf called 'mynewconf'. The file could be a > jar, zip or tar file which contains all the files for this conf. > curl -X POST -H 'Content-Type: application/octet-stream' --data-binary > @testconf.zip > http://localhost:8983/solr/admin/configs/mynewconf?sig= > {code} > A GET to http://localhost:8983/solr/admin/configs will give a list of configs > available > A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the > list of files in mynewconf
[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper
[ https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881883#comment-15881883 ] Ishan Chattopadhyaya commented on SOLR-6736: I have created a branch jira/solr-6736 with the latest patch (after updating it for master). Regarding the security vulnerability that this new API exposes, I have the following thoughts to take this forward: # We can allow unauthenticated/unauthorized users to upload a configset, but mark such configsets with a "trusted=false" flag while storing in ZK (metadata on the configset's znode). If this endpoint is secured using authorization and authentication, then we can store the uploaded configsets with "trusted=true". # Upon creation of a collection using an untrusted configset, any attempt to register a "vulnerable" component, e.g. StatelessScriptUpdateProcessor, XsltUpdateRequestHandler, DataImportHandler etc., should fail with an error that indicates that the configset was not trusted and it can be made trusted by enabling authentication/authorization for the API endpoint and re-uploading the configset. Same error when using a config API command to register any update handler using an untrusted configset. As a separate exercise, we should audit our use of the XML parser to ensure XXE attacks are not possible on XML files, either uploaded from here/elsewhere or loaded from the disk. [~varunrajput], [~anshumg], [~noble.paul], WDYT? 
> A collections-like request handler to manage solr configurations on zookeeper > - > > Key: SOLR-6736 > URL: https://issues.apache.org/jira/browse/SOLR-6736 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Varun Rajput >Assignee: Ishan Chattopadhyaya > Attachments: newzkconf.zip, SOLR-6736-newapi.patch, > SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, > SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, > test_pub.der, zkconfighandler.zip, zkconfighandler.zip > > > Managing Solr configuration files on zookeeper becomes cumbersome while using > solr in cloud mode, especially while trying out changes in the > configurations. > It will be great if there is a request handler that can provide an API to > manage the configurations similar to the collections handler that would allow > actions like uploading new configurations, linking them to a collection, > deleting configurations, etc. > example : > {code} > #use the following command to upload a new configset called mynewconf. This > will fail if there is alredy a conf called 'mynewconf'. The file could be a > jar , zip or a tar file which contains all the files for the this conf. > curl -X POST -H 'Content-Type: application/octet-stream' --data-binary > @testconf.zip > http://localhost:8983/solr/admin/configs/mynewconf?sig= > {code} > A GET to http://localhost:8983/solr/admin/configs will give a list of configs > available > A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the > list of files in mynewconf -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
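The gating rules proposed in the comment above (configsets uploaded without authentication are marked untrusted, untrusted configsets may not register "vulnerable" components, and an untrusted upload must never replace a trusted configset) can be sketched as follows. The component names come from the comment; the helper functions and their signatures are illustrative assumptions, not the patch's actual code.

```python
# Illustrative sketch of the trusted-configset rules proposed in the
# comment. The vulnerable component names are quoted from the comment;
# the helpers themselves are assumptions for illustration only.

VULNERABLE_COMPONENTS = {
    "StatelessScriptUpdateProcessor",
    "XsltUpdateRequestHandler",
    "DataImportHandler",
}

def configset_trusted(authenticated, authorized):
    # A configset is stored with trusted=true only when the upload went
    # through an authenticated and authorized endpoint.
    return authenticated and authorized

def can_register(component, trusted):
    # Untrusted configsets may not register vulnerable components.
    return trusted or component not in VULNERABLE_COMPONENTS

def can_overwrite(existing_trusted, upload_trusted):
    # An untrusted upload must never replace a trusted configset.
    return upload_trusted or not existing_trusted

print(can_register("DataImportHandler", False))  # False
```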
[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-10156: -- Description: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", //optional default is 5 documents maxDocFreq=".3", // optional default is no more then 30% of the index (.3) minTermLength="4", // optional default is 4 limit="50")// optional default is 20 {code} was: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", //optional default is 5 documents maxDocFreq=".3", // optional default is no more then 30% of the index (.3) minTermlength="4", // optional default is 4 limit="50")// optional default is 20 {code} > Add significantTerms Streaming Expression > - > > Key: SOLR-10156 > URL: https://issues.apache.org/jira/browse/SOLR-10156 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5 > > Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch > > > The significantTerms Streaming Expression will emit a set of terms from a > *text field* within a doc frequency range for a specific query. 
It will also > score the terms based on how many times the terms appear in the result set, > and how many times the terms appear in the corpus, and return the top N terms > based on this significance score. > Syntax: > {code} > significantTerms(collection, > q="any query", > field="some_text_field", > minDocFreq="5", //optional default is 5 documents > maxDocFreq=".3", // optional default is no more than 30% of > the index (.3) > minTermLength="4", // optional default is 4 > limit="50")// optional default is 20 > {code}
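As a quick sanity check of the documented syntax, a helper that renders the expression string might look like this. The defaults mirror the comments in the syntax block above; the helper itself is only an illustration and not part of Solr.

```python
# Renders a significantTerms expression string matching the syntax
# documented in the issue. Defaults (minDocFreq=5, maxDocFreq=.3,
# minTermLength=4, limit=20) mirror the description; the helper is
# illustrative, not Solr code.

def significant_terms(collection, q, field,
                      min_doc_freq="5", max_doc_freq=".3",
                      min_term_length="4", limit="20"):
    return ('significantTerms(%s, q="%s", field="%s", minDocFreq="%s", '
            'maxDocFreq="%s", minTermLength="%s", limit="%s")'
            % (collection, q, field, min_doc_freq, max_doc_freq,
               min_term_length, limit))

print(significant_terms("collection1", "*:*", "body_txt"))
```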
[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_121) - Build # 19037 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/19037/ Java: 32bit/jdk1.8.0_121 -client -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.handler.extraction.TestExtractionDateUtil.testParseDate Error Message: Incorrect parsed timestamp: 1226583351000 != 1226579751000 (Thu Nov 13 04:35:51 AKST 2008) Stack Trace: java.lang.AssertionError: Incorrect parsed timestamp: 1226583351000 != 1226579751000 (Thu Nov 13 04:35:51 AKST 2008) at __randomizedtesting.SeedInfo.seed([48772F467DACD6:4A510F1A3DD4DB63]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.handler.extraction.TestExtractionDateUtil.assertParsedDate(TestExtractionDateUtil.java:59) at org.apache.solr.handler.extraction.TestExtractionDateUtil.testParseDate(TestExtractionDateUtil.java:54) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:745) Build Log: [...truncated 18343 lines...] [junit4] Suite: org.apache.solr.handler.extraction.TestExtractionDateUtil [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestExtractionDateUtil -Dtests.method=testParseDate -Dtests.seed=48772F467DACD6 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=sr-ME -Dtests.timezone=America/Metlakatla -Dtests.asserts=true
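For what it's worth, the expected and actual timestamps in the failure differ by exactly one hour, which points at a timezone/DST offset being applied differently during parsing (the randomized timezone here was America/Metlakatla). A quick check:

```python
# The expected and actual timestamps from the test failure differ by
# exactly 3,600,000 ms (one hour), consistent with a timezone or DST
# offset discrepancy in the date parser.

expected_ms = 1226579751000
actual_ms = 1226583351000
diff_hours = (actual_ms - expected_ms) / 3600000.0
print(diff_hours)  # 1.0
```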
[jira] [Updated] (SOLR-9835) Create another replication mode for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cao Manh Dat updated SOLR-9835: --- Attachment: (was: SOLR-9835.patch) > Create another replication mode for SolrCloud > - > > Key: SOLR-9835 > URL: https://issues.apache.org/jira/browse/SOLR-9835 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Cao Manh Dat >Assignee: Shalin Shekhar Mangar > Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, > SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, > SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, > SOLR-9835.patch > > > The current replication mechanism of SolrCloud is called state machine: > replicas start in the same initial state, and each input is > distributed across the replicas so that all replicas end up in the same next state. > But this type of replication has some drawbacks > - The commit (which is costly) has to run on all replicas > - Slow recovery, because if a replica misses more than N updates during its down > time, it has to download the entire index from its leader. > So we create another replication mode for SolrCloud called state > transfer, which acts like master/slave replication. Basically > - The leader distributes the update to the other replicas, but only the leader applies > the update to the IndexWriter; the other replicas just store the update in the UpdateLog (like > replication). > - Replicas frequently poll the latest segments from the leader. > Pros: > - Lightweight for indexing, because only the leader runs the commits and > updates. > - Very fast recovery: replicas just have to download the missing segments. > From a CAP point of view, this ticket tries to promise end users a > distributed system with: > - Partition tolerance > - Weak consistency for normal queries: clusters can serve stale data. 
This > happens when the leader finishes a commit and a slave is still fetching the latest segment. > This period is at most {{pollInterval + time to fetch the latest segment}}. > - Consistency for RTG: if we *do not use DBQs*, replicas will be consistent > with the master just like in the original SolrCloud mode > - Weak availability: just like the original SolrCloud mode. If a leader goes down, > clients must wait until a new leader is elected. > To use this new replication mode, a new collection must be created with an > additional parameter {{liveReplicas=1}} > {code} > http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&liveReplicas=1 > {code}
[jira] [Updated] (SOLR-9835) Create another replication mode for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cao Manh Dat updated SOLR-9835: --- Attachment: SOLR-9835.patch

Updated patch based on comments from [~shalinmangar] and [~ichattopadhyaya]

bq. 2. ZkController.register method – The condition for !isLeader && onlyLeaderIndexes can be replaced by the isReplicaInOnlyLeaderIndexes variable.

Done!

bq. 3. Since there is no log replay on startup on replicas anymore, what if the replica is killed (which keeps its state as 'active' in ZK) and then the cluster is restarted and the replica becomes leader candidate? If we do not replay the discarded log then it could lead to data loss?

To solve this problem, we call {{copyOverOldUpdates}} from the most recent tlog on startup.

bq. 4. UpdateLog – Can you please add javadocs outlining the motivation/purpose of the new methods such as copyOverBufferingUpdates and switchToNewTlog e.g. why does switchToNewTlog require copying over some updates from the old tlog?

Done!

bq. 6. UpdateLog – why does copyOverBufferUpdates block updates while calling switchToNewTlog but ReplicateFromLeader doesn't? How are they both safe?

Both of them block updates now.

bq. 8. ZkController.startReplicationFromLeader – Using a ConcurrentHashMap is not enough to prevent two simultaneous replications from happening concurrently. You should use the atomic putIfAbsent to put a core to the map before starting replication.

Done!

bq. Also, let's add a simple test to ensure that in-place updates work on a replica

I modified {{TestInPlaceUpdatesDistrib}} to randomly run in the new mode. When the tests run in the new mode, we skip some out-of-order DBQ tests.

> Create another replication mode for SolrCloud
> Key: SOLR-9835
> URL: https://issues.apache.org/jira/browse/SOLR-9835
> Project: Solr
> Issue Type: Bug
> Reporter: Cao Manh Dat
> Assignee: Shalin Shekhar Mangar
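The fix requested in review point 8 above — replacing a check-then-act on a ConcurrentHashMap with an atomic putIfAbsent — can be sketched as follows (hypothetical class and method names; not the actual ZkController code):

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the race in point 8: a separate containsKey()/put() pair lets two
// threads both pass the check, so the map insertion itself must be atomic.
public class ReplicationGuard {
    private final ConcurrentHashMap<String, Boolean> replicating = new ConcurrentHashMap<>();

    /** Returns true only for the single caller that wins the right to replicate this core. */
    public boolean tryStartReplication(String coreName) {
        // putIfAbsent is atomic: exactly one thread sees null and proceeds.
        return replicating.putIfAbsent(coreName, Boolean.TRUE) == null;
    }

    public void finishReplication(String coreName) {
        replicating.remove(coreName);
    }

    public static void main(String[] args) {
        ReplicationGuard guard = new ReplicationGuard();
        System.out.println(guard.tryStartReplication("core1")); // true: first caller wins
        System.out.println(guard.tryStartReplication("core1")); // false: replication already running
        guard.finishReplication("core1");
        System.out.println(guard.tryStartReplication("core1")); // true: allowed again after finish
    }
}
```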
[jira] [Updated] (SOLR-9835) Create another replication mode for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cao Manh Dat updated SOLR-9835: --- Attachment: SOLR-9835.patch
[jira] [Commented] (SOLR-8182) TestSolrCloudWithKerberosAlt fails consistently on JDK9
[ https://issues.apache.org/jira/browse/SOLR-8182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881732#comment-15881732 ] Hoss Man commented on SOLR-8182: It's not clear to me if the initially reported test failures (pre-jigsaw) were caused by a JVM bug that's been fixed in more recent Java9 EA builds, or if the underlying problem still exists (either in the JVM or in Solr) but we never get that far because of jigsaw related failures. We almost certainly won't know the answer until SOLR-8052 and SOLR-10199 are resolved, so marking this bug as blocked by both of those. > TestSolrCloudWithKerberosAlt fails consistently on JDK9 > --- > > Key: SOLR-8182 > URL: https://issues.apache.org/jira/browse/SOLR-8182 > Project: Solr > Issue Type: Test > Components: security, SolrCloud >Reporter: Shalin Shekhar Mangar >Priority: Minor > Labels: Java9 > Fix For: 5.5, 6.0 > > > The test fails consistently on JDK9 with the following initialization error: > {code} > FAILED: org.apache.solr.cloud.TestSolrCloudWithKerberosAlt.testBasics > Error Message: > org.apache.directory.api.ldap.model.exception.LdapOtherException: > ERR_04447_CANNOT_NORMALIZE_VALUE Cannot normalize the wrapped value > ERR_04473_NOT_VALID_VALUE Not a valid value '20090818022733Z' for the > AttributeType 'ATTRIBUTE_TYPE ( 1.3.6.1.4.1.18060.0.4.1.2.35 NAME > 'schemaModifyTimestamp' DESC time which schema was modified SUP > modifyTimestamp EQUALITY generalizedTimeMatch ORDERING > generalizedTimeOrderingMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.24 USAGE > directoryOperation ) ' > Stack Trace: > org.apache.directory.api.ldap.model.exception.LdapOtherException: > org.apache.directory.api.ldap.model.exception.LdapOtherException: > ERR_04447_CANNOT_NORMALIZE_VALUE Cannot normalize the wrapped value > ERR_04473_NOT_VALID_VALUE Not a valid value '20090818022733Z' for the > AttributeType 'ATTRIBUTE_TYPE ( 1.3.6.1.4.1.18060.0.4.1.2.35 > NAME 'schemaModifyTimestamp' > DESC time which schema was modified > SUP 
modifyTimestamp > EQUALITY generalizedTimeMatch > ORDERING generalizedTimeOrderingMatch > SYNTAX 1.3.6.1.4.1.1466.115.121.1.24 > USAGE directoryOperation > ) > ' > at > __randomizedtesting.SeedInfo.seed([321A63D948BF59B7:FC2CDF5705107C7]:0) > at > org.apache.directory.server.core.api.partition.AbstractPartition.initialize(AbstractPartition.java:84) > at > org.apache.directory.server.core.DefaultDirectoryService.initialize(DefaultDirectoryService.java:1808) > at > org.apache.directory.server.core.DefaultDirectoryService.startup(DefaultDirectoryService.java:1248) > at > org.apache.hadoop.minikdc.MiniKdc.initDirectoryService(MiniKdc.java:383) > at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:319) > at > org.apache.solr.cloud.TestSolrCloudWithKerberosAlt.setupMiniKdc(TestSolrCloudWithKerberosAlt.java:105) > at > org.apache.solr.cloud.TestSolrCloudWithKerberosAlt.setUp(TestSolrCloudWithKerberosAlt.java:94) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8052) Tests using MiniKDC do not work with Java 9 Jigsaw
[ https://issues.apache.org/jira/browse/SOLR-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881720#comment-15881720 ] Hoss Man commented on SOLR-8052: Created SOLR-10199 to track the *non-test* code problems with using Solr's kerberos features in Java9 (discovered with this patch) > Tests using MiniKDC do not work with Java 9 Jigsaw > -- > > Key: SOLR-8052 > URL: https://issues.apache.org/jira/browse/SOLR-8052 > Project: Solr > Issue Type: Bug > Components: Authentication >Affects Versions: 5.3 >Reporter: Uwe Schindler > Labels: Java9 > Attachments: SOLR-8052.patch > > > As described in my status update yesterday, there are some problems in > dependencies shipped with Solr that don't work with Java 9 Jigsaw builds. > org.apache.solr.cloud.SaslZkACLProviderTest.testSaslZkACLProvider > {noformat} >[junit4]> Throwable #1: java.lang.RuntimeException: > java.lang.IllegalAccessException: Class org.apache.hadoop.minikdc.MiniKdc can > not access a member of class sun.security.krb5.Config (module > java.security.jgss) with modifiers "public static", module java.security.jgss > does not export sun.security.krb5 to >[junit4]>at > org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:211) >[junit4]>at > org.apache.solr.cloud.SaslZkACLProviderTest.setUp(SaslZkACLProviderTest.java:81) >[junit4]>at java.lang.Thread.run(java.base@9.0/Thread.java:746) >[junit4]> Caused by: java.lang.IllegalAccessException: Class > org.apache.hadoop.minikdc.MiniKdc can not access a member of class > sun.security.krb5.Config (module java.security.jgss) with modifiers "public > static", module java.security.jgss does not export sun.security.krb5 to > >[junit4]>at > java.lang.reflect.AccessibleObject.slowCheckMemberAccess(java.base@9.0/AccessibleObject.java:384) >[junit4]>at > java.lang.reflect.AccessibleObject.checkAccess(java.base@9.0/AccessibleObject.java:376) >[junit4]>at > 
org.apache.hadoop.minikdc.MiniKdc.initKDCServer(MiniKdc.java:478) >[junit4]>at > org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:320) >[junit4]>at > org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:204) >[junit4]>... 38 more > Throwable #2: > java.lang.NullPointerException >[junit4]>at > org.apache.solr.cloud.ZkTestServer$ZKServerMain.shutdown(ZkTestServer.java:334) >[junit4]>at > org.apache.solr.cloud.ZkTestServer.shutdown(ZkTestServer.java:526) >[junit4]>at > org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.shutdown(SaslZkACLProviderTest.java:218) >[junit4]>at > org.apache.solr.cloud.SaslZkACLProviderTest.tearDown(SaslZkACLProviderTest.java:116) >[junit4]>at java.lang.Thread.run(java.base@9.0/Thread.java:746) > {noformat} > This is really bad, bad, bad! All security related stuff should never ever be reflected on! > So we have to open an issue in the MiniKdc project so they remove the "hacks". > Elasticsearch had similar problems with Amazon's AWS API. They worked around it with a funny hack in their SecurityPolicy > (https://github.com/elastic/elasticsearch/pull/13538). But as Solr does not run with a SecurityManager in production, there is no way to do that. > We should report an issue to the MiniKdc project, so they fix their code and remove the really bad reflection on Java's internal classes. > FYI, my > [conclusion|http://mail-archives.apache.org/mod_mbox/lucene-dev/201509.mbox/%3C014801d0ee23%245c8f5df0%2415ae19d0%24%40thetaphi.de%3E] > from yesterday.
[jira] [Created] (SOLR-10199) Solr's Kerberos functionality does not work in Java9 due to dependency on hadoop's AuthenticationFilter which attempts access to JVM protected classes
Hoss Man created SOLR-10199: --- Summary: Solr's Kerberos functionality does not work in Java9 due to dependency on hadoop's AuthenticationFilter which attempts access to JVM protected classes Key: SOLR-10199 URL: https://issues.apache.org/jira/browse/SOLR-10199 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Hoss Man

(discovered this while working on test improvements for SOLR-8052)

Our Kerberos based authn/authz features are all built on top of Hadoop's {{AuthenticationFilter}}, which in turn uses Hadoop's {{KerberosUtil}} -- but this does not work on Java9/jigsaw JVMs because that class in turn attempts to access {{sun.security.jgss.GSSUtil}}, which is not exported by {{module java.security.jgss}}.

This means that Solr users who depend on Kerberos will not be able to upgrade to Java9, even if they do not use any Hadoop-specific features of Solr.

Example log messages...

{noformat}
[junit4] 2> 6833 WARN (qtp442059499-30) [] o.a.h.s.a.s.AuthenticationFilter Authentication exception: java.lang.IllegalAccessException: class org.apache.hadoop.security.authentication.util.KerberosUtil cannot access class sun.security.jgss.GSSUtil (in module java.security.jgss) because module java.security.jgss does not export sun.security.jgss to unnamed module @4b38fe8b
[junit4] 2> 6841 WARN (TEST-TestSolrCloudWithKerberosAlt.testBasics-seed#[95A583AF82D1EBBE]) [] o.a.h.c.p.ResponseProcessCookies Invalid cookie header: "Set-Cookie: hadoop.auth=; Path=/; Domain=127.0.0.1; Expires=Ara, 01-Sa-1970 00:00:00 GMT; HttpOnly". Invalid 'expires' attribute: Ara, 01-Sa-1970 00:00:00 GMT
{noformat}

(NOTE: HADOOP-14115 is the cause of the malformed cookie expiration)

ultimately the client gets a 403 error (as seen in a testcase with the patch from SOLR-8052 applied and the java9 assume commented out)... 
{noformat} [junit4] ERROR 7.10s | TestSolrCloudWithKerberosAlt.testBasics <<< [junit4]> Throwable #1: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:34687/solr: Expected mime type application/octet-stream but got text/html. [junit4]> [junit4]> [junit4]> Error 403 [junit4]> [junit4]> [junit4]> HTTP ERROR: 403 [junit4]> Problem accessing /solr/admin/collections. Reason: [junit4]> java.lang.IllegalAccessException: class org.apache.hadoop.security.authentication.util.KerberosUtil cannot access class sun.security.jgss.GSSUtil (in module java.security.jgss) because module java.security.jgss does not export sun.security.jgss to unnamed module @4b38fe8b [junit4]> http://eclipse.org/jetty;>Powered by Jetty:// 9.3.14.v20161028 [junit4]> [junit4]> {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8052) Tests using MiniKDC do not work with Java 9 Jigsaw
[ https://issues.apache.org/jira/browse/SOLR-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-8052: --- Attachment: SOLR-8052.patch

I've been digging into some of the Java9 related SOLR jiras -- starting with the kerberos based test problems -- to try and figure out if these really are test-only bugs and/or if there is anything we can do about making things work better. Based on my initial reading/experimenting, I think we should replace MiniKdc (from hadoop's test infrastructure) with SimpleKdcServer (from the apache kerby project)...

* SimpleKdcServer does not appear to have reflection related bugs that cause problems under jigsaw like MiniKdc does
* SimpleKdcServer does not suffer from the same "can't use multiple nodes" problem (HADOOP-9893) that has required {{TestMiniSolrCloudClusterKerberos}} to be {{@Ignored}} since it was created.
** I was able to add multiple solr nodes to {{TestSolrCloudWithKerberosAlt}} w/o problems after switching
** With a few other modifications, I was able to get {{TestMiniSolrCloudClusterKerberos}} to work as well (details below)
* In hadoop's master branch, MiniKdc has been refactored to use SimpleKdcServer internally anyway

Doing this isn't a silver bullet for the java9/jigsaw related failures (I'll file a new bug about that), but it should help us move forward -- and in general seems like an improvement. The attached patch is a starting point for this change.

One thing I'm not particularly happy with here is that in order to get it to pass, I _had_ to modify {{TestMiniSolrCloudClusterKerberos}} to create a single {{KerberosTestServices}} instance in the {{@BeforeClass}} method, instead of in a regular {{@Before}} method. 
*In and of itself, this change isn't necessarily bad -- it just means we only start one Kerberos server instead of one per method.* What concerns me is that w/o this change, only the first test method would ever pass, and subsequent test methods would log/throw errors from ZK -- and running any single test method with {{-Dtests.method}} would (seemingly) always pass.

My initial suspicion was that something in {{SimpleKdcServer}}, or in our {{KerberosTestServices}} wrapper, wasn't "resetting" the JVM security settings correctly when we shut it down -- but if that were the case I would expect something like {{ant test -Dtests.jvms=1 -Dtests.class=\*Kerber\*}} to fail (even with {{KerberosTestServices}} only ever being instantiated once per test class) when the (sole) Test JVM got to the second test class and instantiated a second {{KerberosTestServices}} instance. However that doesn't seem to be the case. For some reason, using only one {{KerberosTestServices}} in a test class is fine, regardless of how many test classes using kerberos run in that JVM, but using multiple {{KerberosTestServices}} in a single test class causes kerberos failures.

For the purposes of demonstrating this (in contrast with the changes made in {{TestMiniSolrCloudClusterKerberos}}, which seem like a good idea either way) I've added a {{TestHossSanity}} which works just like {{TestMiniSolrCloudClusterKerberos}} except that it initializes {{KerberosTestServices}} on a per test-method basis. Examples of the types of Kerberos errors it logs (after the first test method succeeds)... {noformat} ... [junit4] says 你好! Master seed: 6BEDD90DB0D4DC38 [junit4] Executing 1 suite with 1 JVM. [junit4] [junit4] Started J0 PID(11220@tray). 
[junit4] Suite: org.apache.solr.cloud.TestHossSanity [junit4] 2> 0INFO (TEST-TestHossSanity.testStopAllStartAll-seed#[6BEDD90DB0D4DC38]) [] o.a.s.c.MiniSolrCloudCluster Starting cluster of 5 servers in /home/hossman/lucene/dev/solr/build/solr-core/test/J0/temp/solr.cloud.TestHossSanity_6BEDD90DB0D4DC38-001/tempDir-002 [junit4] 2> 11 INFO (TEST-TestHossSanity.testStopAllStartAll-seed#[6BEDD90DB0D4DC38]) [] o.a.s.c.ZkTestServer STARTING ZK TEST SERVER ...first test (testStopAllStartAll) proceeds and runs fine... [junit4] OK 31.2s | TestHossSanity.testStopAllStartAll [junit4] 2> 30989 INFO (TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38]) [] o.a.s.c.MiniSolrCloudCluster Starting cluster of 5 servers in /home/hossman/lucene/dev/solr/build/solr-core/test/J0/temp/solr.cloud.TestHossSanity_6BEDD90DB0D4DC38-001/tempDir-004 [junit4] 2> 30989 INFO (TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38]) [] o.a.s.c.ZkTestServer STARTING ZK TEST SERVER ... [junit4] 2> 30989 INFO (Thread-100) [] o.a.s.c.ZkTestServer client port:0.0.0.0/0.0.0.0:0 [junit4] 2> 30990 INFO (Thread-100) [] o.a.s.c.ZkTestServer Starting server [junit4] 2> 30995 INFO (pool-11-thread-1) [] o.a.k.k.k.s.r.KdcRequest Client entry is empty. [junit4] 2> 30995 INFO (pool-11-thread-1) [
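The lifecycle change discussed above (one shared fixture per class via {{@BeforeClass}}, instead of one per method via {{@Before}}) can be illustrated without JUnit; the class below is a hypothetical stand-in, not the actual {{KerberosTestServices}} code:

```java
// Illustrates the cost difference behind the @BeforeClass change: with N test
// methods, per-method setup starts the (expensive) KDC server N times, while
// per-class setup starts it exactly once and shares it across methods.
public class FixtureLifecycleSketch {
    static int serversStarted = 0;

    static void startKdcServer() { serversStarted++; } // stands in for KerberosTestServices

    static void runPerMethodStyle(int testMethods) {
        for (int i = 0; i < testMethods; i++) {
            startKdcServer(); // @Before: a fresh server before every test method
        }
    }

    static void runPerClassStyle(int testMethods) {
        startKdcServer();     // @BeforeClass: one server shared by all methods
        for (int i = 0; i < testMethods; i++) {
            // each test method reuses the shared server
        }
    }

    public static void main(String[] args) {
        serversStarted = 0;
        runPerMethodStyle(3);
        System.out.println(serversStarted); // 3
        serversStarted = 0;
        runPerClassStyle(3);
        System.out.println(serversStarted); // 1
    }
}
```

Note this only models the startup cost; the observed failure (a second in-class {{KerberosTestServices}} breaking Kerberos) is a separate, still-unexplained JVM-global-state issue.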
[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-10156: --- Description:

The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times they appear in the result set and how many times they appear in the corpus, and return the top N terms based on this significance score.

Syntax:
{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",    // optional, default is 5 documents
                 maxDocFreq=".3",   // optional, default is no more than 30% of the index (.3)
                 minTermlength="4", // optional, default is 4
                 limit="50")        // optional, default is 20
{code}

was:

The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times they appear in the result set and how many times they appear in the corpus, and return the top N terms based on this significance score.

Syntax:
{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",
                 maxDocFreq=".3",
                 minTermlength="4",
                 limit="50")
{code}

> Add significantTerms Streaming Expression
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
> Issue Type: New Feature
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Fix For: 6.5
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
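The issue describes the scoring only qualitatively (frequent in the result set, rare in the corpus). A hypothetical score of that shape, with the minDocFreq/maxDocFreq filtering from the syntax above, could look like the sketch below; the actual SOLR-10156 formula may differ:

```java
// Hypothetical significance score combining foreground (result-set) counts with
// background (corpus) document frequency; an IDF-style weight, not the real
// SOLR-10156 implementation.
public class SignificantTermsSketch {

    /**
     * @param fgCount    occurrences of the term in the query's result set
     * @param bgDocFreq  documents in the whole corpus containing the term
     * @param numDocs    total documents in the corpus
     * @param minDocFreq absolute lower bound on bgDocFreq (default 5)
     * @param maxDocFreq upper bound as a fraction of the corpus (default .3)
     * @return a significance score, or 0 if the term is outside the doc-frequency range
     */
    public static double score(int fgCount, int bgDocFreq, int numDocs,
                               int minDocFreq, double maxDocFreq) {
        if (bgDocFreq < minDocFreq || bgDocFreq > maxDocFreq * numDocs) {
            return 0.0; // filtered out, mirroring minDocFreq / maxDocFreq
        }
        // Reward terms frequent in the result set but rare in the corpus.
        return fgCount * Math.log((double) numDocs / bgDocFreq);
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        // Rare in the corpus, frequent in the results: significant.
        System.out.println(score(40, 10, numDocs, 5, 0.3) > 0);      // true
        // Appears in 90% of the corpus: filtered out by maxDocFreq.
        System.out.println(score(40, 900, numDocs, 5, 0.3) == 0.0);  // true
    }
}
```

Terms would then be ranked by this score and the top {{limit}} emitted.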
[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-10156: -- Description: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", //optional default is 5 documents maxDocFreq=".3", // optional default is no more then 30% of the index (.3) minTermlength="4", // optional default is 4 limit="50")// optional default is 20 {code} was: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", //optional default is 5 documents maxDocFreq=".3", // optional default is no more then 30% of the index (.3) minTermlength="4", // optional default is 4 limit="50")// optional default is 20 {code} > Add significantTerms Streaming Expression > - > > Key: SOLR-10156 > URL: https://issues.apache.org/jira/browse/SOLR-10156 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5 > > Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch > > > The significantTerms Streaming Expression will emit a set of terms from a > *text field* within a doc frequency range for a specific query. 
It will also > score the terms based on how many times the terms appear in the result set, > and how many times the terms appear in the corpus, and return the top N terms > based on this significance score. > Syntax: > {code} > significantTerms(collection, > q="any query", > field="some_text_field", > minDocFreq="5", //optional default is 5 documents > maxDocFreq=".3", // optional default is no more then 30% of > the index (.3) > minTermlength="4", // optional default is 4 > limit="50")// optional default is 20 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-10156: -- Description: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", //optional default is 5 documents maxDocFreq=".3", // optional default is no more then 30% of the index (.3) minTermlength="4", // optional default is 4 limit="50")// optional default is 20 {code} was: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", //optional default is 5 documents maxDocFreq=".3", // optional default is no more then 30% of the index (.3) minTermlength="4", // optional default is 4 limit="50")// optional default is 20 {code} > Add significantTerms Streaming Expression > - > > Key: SOLR-10156 > URL: https://issues.apache.org/jira/browse/SOLR-10156 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5 > > Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch > > > The significantTerms Streaming Expression will emit a set of terms from a > *text field* within a doc frequency range for a specific query. 
It will also > score the terms based on how many times the terms appear in the result set, > and how many times the terms appear in the corpus, and return the top N terms > based on this significance score. > Syntax: > {code} > significantTerms(collection, > q="any query", >field="some_text_field", >minDocFreq="5", //optional default is 5 documents >maxDocFreq=".3", // optional default is no more > then 30% of the index (.3) >minTermlength="4", // optional default is 4 >limit="50")// optional default is > 20 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-10156: -- Description: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", //optional default is 5 documents maxDocFreq=".3", // optional default is no more then 30% of the index (.3) minTermlength="4", // optional default is 4 limit="50")// optional default is 20 {code} was: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", //optional default is 5 documents maxDocFreq=".3", // optional default is no more then 30% of the index (.3) minTermlength="4", // optional default is 4 limit="50")// optional default is 20 {code} > Add significantTerms Streaming Expression > - > > Key: SOLR-10156 > URL: https://issues.apache.org/jira/browse/SOLR-10156 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5 > > Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch > > > The significantTerms Streaming Expression will emit a set of terms from a > *text field* within a doc frequency range for a specific query. 
It will also > score the terms based on how many times the terms appear in the result set, > and how many times the terms appear in the corpus, and return the top N terms > based on this significance score. > Syntax: > {code} > significantTerms(collection, > q="any query", >field="some_text_field", >minDocFreq="5", //optional default is 5 documents >maxDocFreq=".3", // optional default is no more > than 30% of the index (.3) >minTermlength="4", // optional default is 4 >limit="50") // optional default is > 20 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
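The scoring idea described in the issue (terms that are over-represented in the result set relative to the corpus, filtered by a doc-frequency window) can be sketched roughly as follows. The function name, the ratio-based score, and the data shapes here are illustrative assumptions for discussion; this is not Solr's actual significantTerms implementation.

```python
def significance_scores(result_counts, corpus_counts, num_results, num_docs,
                        min_doc_freq=5, max_doc_freq_ratio=0.3, limit=20):
    """Rank terms that appear unusually often in the result set.

    result_counts / corpus_counts map term -> doc frequency in the result
    set / whole index. The score (result-set ratio divided by corpus ratio)
    is a stand-in; Solr's real scoring formula is not reproduced here.
    """
    scored = []
    for term, rc in result_counts.items():
        cc = corpus_counts.get(term, rc)
        # Apply the doc-frequency window from minDocFreq / maxDocFreq.
        if cc < min_doc_freq or cc > max_doc_freq_ratio * num_docs:
            continue
        score = (rc / num_results) / (cc / num_docs)
        scored.append((term, score))
    scored.sort(key=lambda ts: ts[1], reverse=True)
    return scored[:limit]
```

A stopword like "the" is excluded by the maxDocFreq cap, and a term below minDocFreq never qualifies, which matches the defaults quoted in the syntax above.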
[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-10156: -- Description: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="any query", field="some_text_field", minDocFreq="5", maxDocFreq=".3", minTermlength="4", limit="50") {code} was: The significantTerms Streaming Expression will emit a set of terms from a *text field* within a doc frequency range for a specific query. It will also score the terms based on how many times the terms appear in the result set, and how many times the terms appear in the corpus, and return the top N terms based on this significance score. Syntax: {code} significantTerms(collection, q="abc", field="some_text_field", minDocFreq="x", maxDocFreq="y", limit="50") {code} > Add significantTerms Streaming Expression > - > > Key: SOLR-10156 > URL: https://issues.apache.org/jira/browse/SOLR-10156 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5 > > Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch > > > The significantTerms Streaming Expression will emit a set of terms from a > *text field* within a doc frequency range for a specific query. It will also > score the terms based on how many times the terms appear in the result set, > and how many times the terms appear in the corpus, and return the top N terms > based on this significance score. 
> Syntax: > {code} > significantTerms(collection, >q="any query", >field="some_text_field", >minDocFreq="5", >maxDocFreq=".3", >minTermlength="4", >limit="50") > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length
[ https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881610#comment-15881610 ] Erick Erickson commented on LUCENE-7705: These two tests fail: org.apache.lucene.analysis.core.TestUnicodeWhitespaceTokenizer.testParamsFactory org.apache.lucene.analysis.core.TestRandomChains (suite) TestUnicodeWhitespaceTokenizer fails because I added "to" to one of the exception messages, a trivial fix. No idea what's happening with TestRandomChains though. And my nocommit ensures that 'ant precommit' will fail too; that's rather the point. > Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the > max token length > - > > Key: LUCENE-7705 > URL: https://issues.apache.org/jira/browse/LUCENE-7705 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Amrit Sarkar >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7705.patch > > > SOLR-10186 > [~erickerickson]: Is there a good reason that we hard-code a 256 character > limit for the CharTokenizer? In order to change this limit it requires that > people copy/paste the incrementToken into some new class since incrementToken > is final. > KeywordTokenizer can easily change the default (which is also 256 bytes), but > to do so requires code rather than being able to configure it in the schema. > For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes > (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) > (Factories) it would take adding a c'tor to the base class in Lucene and > using it in the factory. > Any objections? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-10186) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length
[ https://issues.apache.org/jira/browse/SOLR-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-10186: - Assignee: Erick Erickson > Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the > max token length > - > > Key: SOLR-10186 > URL: https://issues.apache.org/jira/browse/SOLR-10186 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Minor > Attachments: SOLR-10186.patch, SOLR-10186.patch, SOLR-10186.patch > > > Is there a good reason that we hard-code a 256 character limit for the > CharTokenizer? In order to change this limit it requires that people > copy/paste the incrementToken into some new class since incrementToken is > final. > KeywordTokenizer can easily change the default (which is also 256 bytes), but > to do so requires code rather than being able to configure it in the schema. > For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes > (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) > (Factories) it would take adding a c'tor to the base class in Lucene and > using it in the factory. > Any objections? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
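The change under discussion (turning the hard-coded 256-character limit into a constructor parameter so the schema can configure it) can be illustrated with a toy whitespace tokenizer. This is a sketch of the concept only, not Lucene's CharTokenizer; in particular, whether an over-long token is truncated or split into multiple tokens is a detail of the real implementation, and the sketch simply truncates.

```python
class ToyWhitespaceTokenizer:
    """Splits on whitespace and caps each token at max_token_len characters.

    Mirrors the idea in LUCENE-7705/SOLR-10186: the limit (historically a
    hard-coded 256 in CharTokenizer) becomes a constructor argument, so no
    one has to copy/paste incrementToken into a new class to change it.
    """
    DEFAULT_MAX_TOKEN_LEN = 256

    def __init__(self, max_token_len=DEFAULT_MAX_TOKEN_LEN):
        if max_token_len < 1:
            raise ValueError("maxTokenLen must be >= 1")
        self.max_token_len = max_token_len

    def tokenize(self, text):
        # Truncation here stands in for however the real tokenizer
        # handles tokens that exceed the configured length.
        return [tok[: self.max_token_len] for tok in text.split()]
```

A factory would then read a `maxTokenLen` attribute from the schema and pass it to this constructor, which is essentially the "add a c'tor to the base class and use it in the factory" plan described above.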
[jira] [Assigned] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length
[ https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned LUCENE-7705: -- Assignee: Erick Erickson > Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the > max token length > - > > Key: LUCENE-7705 > URL: https://issues.apache.org/jira/browse/LUCENE-7705 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Amrit Sarkar >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7705.patch > > > SOLR-10186 > [~erickerickson]: Is there a good reason that we hard-code a 256 character > limit for the CharTokenizer? In order to change this limit it requires that > people copy/paste the incrementToken into some new class since incrementToken > is final. > KeywordTokenizer can easily change the default (which is also 256 bytes), but > to do so requires code rather than being able to configure it in the schema. > For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes > (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) > (Factories) it would take adding a c'tor to the base class in Lucene and > using it in the factory. > Any objections? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-10194) Unable to use the UninvertedField implementation with legacy facets
[ https://issues.apache.org/jira/browse/SOLR-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881542#comment-15881542 ] Shawn Heisey commented on SOLR-10194: - This could be part of an issue where the description is just "Solr 6.x performance is much worse than Solr 4.x performance." This statement is particularly true when facets (and probably grouping) are involved. For the person who filed this issue (who I have been talking to via IRC), enabling docValues and reindexing makes performance worse, not better. > Unable to use the UninvertedField implementation with legacy facets > --- > > Key: SOLR-10194 > URL: https://issues.apache.org/jira/browse/SOLR-10194 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 6.2, 6.3, 6.4.1 > Environment: Linux >Reporter: Victor Igumnov >Priority: Minor > Labels: easyfix > > FacetComponent's method "modifyRequestForFieldFacets" modifies the > distributed facet request and sets the mincount to zero, after which the > SimpleFacets implementation cannot enter the UIF code block when > facet.method=uif is applied. The workaround which I found is to use > facet.distrib.mco=true, which sets the mincount to one instead of zero. > Working: > http://somehost:9100/solr/collection/select?facet.method=uif=attribute=*:*=true=true=true > > Non-working: > http://somehost:9100/solr/collection/select?facet.method=uif=attribute=*:*=true=true=false > Semi-working when it isn't a distributed call: > http://somehost:9100/solr/collection/select?facet.method=uif=attribute=*:*=true=true=false=false > Just make sure to run it on a multi-shard setup. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
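The mechanism the report describes (the distributed facet rewrite forcing mincount to 0, which disqualifies the uif method, while facet.distrib.mco=true keeps it at 1) can be sketched as follows. The function and the `effective.method` key are invented for illustration and deliberately simplified; they are not Solr's FacetComponent or SimpleFacets code.

```python
def modify_request_for_field_facets(params):
    """Toy version of the distributed-facet rewrite described in SOLR-10194.

    By default the per-shard request gets facet.mincount=0, and (per the
    report) the uif path then refuses to run. facet.distrib.mco=true keeps
    the mincount at 1 so facet.method=uif can still apply.
    """
    mco = params.get("facet.distrib.mco", "false") == "true"
    params = dict(params)  # work on a copy of the request params
    params["facet.mincount"] = 1 if mco else 0
    # Simplified condition: uif is only honored when mincount > 0.
    if params.get("facet.method") == "uif" and params["facet.mincount"] > 0:
        params["effective.method"] = "uif"
    else:
        params["effective.method"] = "fc"
    return params
```

This matches the observed behavior: the "Working" URL sets facet.distrib.mco=true, the "Non-working" one leaves it false, and a non-distributed request never goes through the rewrite at all.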
[jira] [Commented] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length
[ https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881459#comment-15881459 ] Erick Erickson commented on LUCENE-7705: I think the patch I uploaded is the result of applying your most recent patch for SOLR-10186, but can you verify? We should probably consolidate the two; I suggest we close the Solr one as a duplicate and continue iterating here. Erick > Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the > max token length > - > > Key: LUCENE-7705 > URL: https://issues.apache.org/jira/browse/LUCENE-7705 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Amrit Sarkar >Priority: Minor > Attachments: LUCENE-7705.patch > > > SOLR-10186 > [~erickerickson]: Is there a good reason that we hard-code a 256 character > limit for the CharTokenizer? In order to change this limit it requires that > people copy/paste the incrementToken into some new class since incrementToken > is final. > KeywordTokenizer can easily change the default (which is also 256 bytes), but > to do so requires code rather than being able to configure it in the schema. > For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes > (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) > (Factories) it would take adding a c'tor to the base class in Lucene and > using it in the factory. > Any objections? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-6.x-Windows (64bit/jdk1.8.0_121) - Build # 747 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Windows/747/ Java: 64bit/jdk1.8.0_121 -XX:-UseCompressedOops -XX:+UseG1GC 2 tests failed. FAILED: org.apache.solr.cloud.OverseerRolesTest.testOverseerRole Error Message: Timed out waiting for overseer state change Stack Trace: java.lang.AssertionError: Timed out waiting for overseer state change at __randomizedtesting.SeedInfo.seed([742460D0CAF956E6:95EF9D44F14A6037]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.cloud.OverseerRolesTest.waitForNewOverseer(OverseerRolesTest.java:62) at org.apache.solr.cloud.OverseerRolesTest.testOverseerRole(OverseerRolesTest.java:140) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:745) FAILED: org.apache.solr.cloud.ShardSplitTest.testSplitWithChaosMonkey Error Message: There are still nodes recoverying - waited for 330 seconds Stack Trace: java.lang.AssertionError: There are still nodes recoverying - waited for 330
[jira] [Updated] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length
[ https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated LUCENE-7705: --- Attachment: LUCENE-7705.patch Patch that fixes up a few comments and regularizes maxChars* to maxToken* and the like. I enhanced a test to cover tokens longer than 256 characters. There was a problem with LowerCaseTokenizerFactory: the getMultiTermComponent method constructed a LowerCaseFilterFactory with the _original_ arguments including maxTokenLen, which then threw an error. There's a nocommit in there for the nonce; what's the right thing to do here? [~amrit sarkar] Do you have any ideas for a more elegant solution? The nocommit is there because this feels just too hacky, but it does prove that this is the problem. It seems like we should close SOLR-10186 and just make the code changes here. With this patch I successfully tested adding fields with tokens longer than 256 and shorter, so I don't think there's anything beyond this patch to do with Solr. I suppose we could add some maxTokenLen bits to some of the schemas just to exercise that (which would have found the LowerCaseTokenizerFactory bit). > Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the > max token length > - > > Key: LUCENE-7705 > URL: https://issues.apache.org/jira/browse/LUCENE-7705 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Amrit Sarkar >Priority: Minor > Attachments: LUCENE-7705.patch > > > SOLR-10186 > [~erickerickson]: Is there a good reason that we hard-code a 256 character > limit for the CharTokenizer? In order to change this limit it requires that > people copy/paste the incrementToken into some new class since incrementToken > is final. > KeywordTokenizer can easily change the default (which is also 256 bytes), but > to do so requires code rather than being able to configure it in the schema. > For KeywordTokenizer, this is Solr-only. 
For the CharTokenizer classes > (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) > (Factories) it would take adding a c'tor to the base class in Lucene and > using it in the factory. > Any objections? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-master - Build # 1691 - Still Unstable
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/1691/ 1 tests failed. FAILED: org.apache.solr.cloud.CdcrBootstrapTest.testConvertClusterToCdcrAndBootstrap Error Message: Document mismatch on target after sync expected:<1000> but was:<0> Stack Trace: java.lang.AssertionError: Document mismatch on target after sync expected:<1000> but was:<0> at __randomizedtesting.SeedInfo.seed([1375753A57F29F1F:C4A25A4DE3AD0758]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.solr.cloud.CdcrBootstrapTest.testConvertClusterToCdcrAndBootstrap(CdcrBootstrapTest.java:134) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) 
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:745) Build Log: [...truncated 12445 lines...] [junit4] Suite: org.apache.solr.cloud.CdcrBootstrapTest [junit4] 2> Creating
[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs
[ https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881362#comment-15881362 ] ASF subversion and git services commented on SOLR-9764: --- Commit 92e619260cc89b4725c2e5e971fc3cb7bbb339cc in lucene-solr's branch refs/heads/branch_6x from [~yo...@apache.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=92e6192 ] SOLR-9764: fix CHANGES entry > Design a memory efficient DocSet if a query returns all docs > > > Key: SOLR-9764 > URL: https://issues.apache.org/jira/browse/SOLR-9764 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Michael Sun >Assignee: Yonik Seeley > Fix For: 6.5, master (7.0) > > Attachments: SOLR_9764_no_cloneMe.patch, SOLR-9764.patch, > SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, > SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch > > > In some use cases, particularly use cases with time series data, using > collection alias and partitioning data into multiple small collections using > timestamp, a filter query can match all documents in a collection. Currently > BitDocSet is used, which contains a large array of long integers with every > bit set to 1. After querying, the resulting DocSet saved in the filter cache is > large and becomes one of the main memory consumers in these use cases. > For example, suppose a Solr setup has 14 collections for data in the last 14 > days, each collection with one day of data. A filter query for the last week of > data would result in at least six DocSets in the filter cache, each matching all > documents in one of six collections respectively. > This is to design a new DocSet that is memory efficient for such a use case. > The new DocSet removes the large array, reducing memory usage and GC pressure > without losing the advantage of a large filter cache. 
> In particular, for use cases when using time series data, collection alias > and partition data into multiple small collections using timestamp, the gain > can be large. > For further optimization, it may be helpful to design a DocSet with run > length encoding. Thanks [~mmokhtar] for suggestion. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
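The memory argument above can be made concrete with a toy comparison: a bitset-backed doc set always costs roughly one bit per document in the index, while a "matches everything" doc set only needs to remember the document count. The class names and `memory_bytes` accounting below are illustrative assumptions, not Solr's BitDocSet/DocSet API.

```python
class ToyBitDocSet:
    """Bitset-backed doc set: one bit per document in the index."""

    def __init__(self, max_doc, matching):
        self.max_doc = max_doc
        self.bits = bytearray((max_doc + 7) // 8)
        for doc in matching:
            self.bits[doc // 8] |= 1 << (doc % 8)

    def contains(self, doc):
        return bool((self.bits[doc // 8] >> (doc % 8)) & 1)

    def memory_bytes(self):
        return len(self.bits)


class ToyMatchAllDocSet:
    """Constant-size doc set for a query that matches every document."""

    def __init__(self, max_doc):
        self.max_doc = max_doc

    def contains(self, doc):
        return 0 <= doc < self.max_doc

    def memory_bytes(self):
        return 8  # roughly: just the stored max_doc
```

For a one-day collection where a "last week" filter matches everything, caching the match-all representation instead of a fully set bitset is where the savings described in the issue come from.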
[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs
[ https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881361#comment-15881361 ] ASF subversion and git services commented on SOLR-9764: --- Commit 05c17c9a516d8501b2dcce9b5910a3d0b5510bc4 in lucene-solr's branch refs/heads/master from [~yo...@apache.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=05c17c9 ] SOLR-9764: fix CHANGES entry > Design a memory efficient DocSet if a query returns all docs > > > Key: SOLR-9764 > URL: https://issues.apache.org/jira/browse/SOLR-9764 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Michael Sun >Assignee: Yonik Seeley > Fix For: 6.5, master (7.0) > > Attachments: SOLR_9764_no_cloneMe.patch, SOLR-9764.patch, > SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, > SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch > > > In some use cases, particularly use cases with time series data, using > collection alias and partitioning data into multiple small collections using > timestamp, a filter query can match all documents in a collection. Currently > BitDocSet is used which contains a large array of long integers with every > bits set to 1. After querying, the resulted DocSet saved in filter cache is > large and becomes one of the main memory consumers in these use cases. > For example. suppose a Solr setup has 14 collections for data in last 14 > days, each collection with one day of data. A filter query for last one week > data would result in at least six DocSet in filter cache which matches all > documents in six collections respectively. > This is to design a new DocSet that is memory efficient for such a use case. > The new DocSet removes the large array, reduces memory usage and GC pressure > without losing advantage of large filter cache. 
> In particular, for use cases when using time series data, collection alias > and partition data into multiple small collections using timestamp, the gain > can be large. > For further optimization, it may be helpful to design a DocSet with run > length encoding. Thanks [~mmokhtar] for suggestion. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs
[ https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881330#comment-15881330 ] Yonik Seeley commented on SOLR-9764: Hmmm, yep. I'll fix... > Design a memory efficient DocSet if a query returns all docs > > > Key: SOLR-9764 > URL: https://issues.apache.org/jira/browse/SOLR-9764 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Michael Sun >Assignee: Yonik Seeley > Fix For: 6.5, master (7.0) > > Attachments: SOLR_9764_no_cloneMe.patch, SOLR-9764.patch, > SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, > SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch > > > In some use cases, particularly use cases with time series data, using > collection alias and partitioning data into multiple small collections using > timestamp, a filter query can match all documents in a collection. Currently > BitDocSet is used which contains a large array of long integers with every > bits set to 1. After querying, the resulted DocSet saved in filter cache is > large and becomes one of the main memory consumers in these use cases. > For example. suppose a Solr setup has 14 collections for data in last 14 > days, each collection with one day of data. A filter query for last one week > data would result in at least six DocSet in filter cache which matches all > documents in six collections respectively. > This is to design a new DocSet that is memory efficient for such a use case. > The new DocSet removes the large array, reduces memory usage and GC pressure > without losing advantage of large filter cache. > In particular, for use cases when using time series data, collection alias > and partition data into multiple small collections using timestamp, the gain > can be large. > For further optimization, it may be helpful to design a DocSet with run > length encoding. Thanks [~mmokhtar] for suggestion. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-6.x-Linux (32bit/jdk1.8.0_121) - Build # 2927 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/2927/
Java: 32bit/jdk1.8.0_121 -server -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.update.TestInPlaceUpdatesDistrib.test

Error Message:
'sanitycheck' results against client: org.apache.solr.client.solrj.impl.HttpSolrClient@14e7321 (not leader) wrong [docid] for SolrDocument{id=180, id_field_copy_that_does_not_support_in_place_update_s=180, title_s=title180, id_i=180, inplace_updatable_float=101.0, _version_=1560158401798864896, inplace_updatable_int_with_default=666, inplace_updatable_float_with_default=42.0, [docid]=970} expected:<658> but was:<970>

Stack Trace:
java.lang.AssertionError: 'sanitycheck' results against client: org.apache.solr.client.solrj.impl.HttpSolrClient@14e7321 (not leader) wrong [docid] for SolrDocument{id=180, id_field_copy_that_does_not_support_in_place_update_s=180, title_s=title180, id_i=180, inplace_updatable_float=101.0, _version_=1560158401798864896, inplace_updatable_int_with_default=666, inplace_updatable_float_with_default=42.0, [docid]=970} expected:<658> but was:<970>
	at __randomizedtesting.SeedInfo.seed([E6685467432DBE18:6E3C6BBDEDD1D3E0]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.failNotEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:128)
	at org.apache.solr.update.TestInPlaceUpdatesDistrib.assertDocIdsAndValuesInResults(TestInPlaceUpdatesDistrib.java:442)
	at org.apache.solr.update.TestInPlaceUpdatesDistrib.assertDocIdsAndValuesAgainstAllClients(TestInPlaceUpdatesDistrib.java:413)
	at org.apache.solr.update.TestInPlaceUpdatesDistrib.docValuesUpdateTest(TestInPlaceUpdatesDistrib.java:321)
	at org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:140)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
	at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992)
	at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at
[JENKINS] Lucene-Solr-Tests-6.x - Build # 749 - Unstable
Build: https://builds.apache.org/job/Lucene-Solr-Tests-6.x/749/

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.security.hadoop.TestSolrCloudWithHadoopAuthPlugin

Error Message:
Address already in use

Stack Trace:
java.net.BindException: Address already in use
	at __randomizedtesting.SeedInfo.seed([D107F67B9D7C17EA]:0)
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:433)
	at sun.nio.ch.Net.bind(Net.java:425)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at org.apache.mina.transport.socket.nio.NioSocketAcceptor.open(NioSocketAcceptor.java:252)
	at org.apache.mina.transport.socket.nio.NioSocketAcceptor.open(NioSocketAcceptor.java:49)
	at org.apache.mina.core.polling.AbstractPollingIoAcceptor.registerHandles(AbstractPollingIoAcceptor.java:525)
	at org.apache.mina.core.polling.AbstractPollingIoAcceptor.access$200(AbstractPollingIoAcceptor.java:67)
	at org.apache.mina.core.polling.AbstractPollingIoAcceptor$Acceptor.run(AbstractPollingIoAcceptor.java:409)
	at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:65)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Build Log:
[...truncated 11778 lines...]
   [junit4] Suite: org.apache.solr.security.hadoop.TestSolrCloudWithHadoopAuthPlugin
   [junit4]   2> Creating dataDir: /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-6.x/solr/build/solr-core/test/J0/temp/solr.security.hadoop.TestSolrCloudWithHadoopAuthPlugin_D107F67B9D7C17EA-001/init-core-data-001
   [junit4]   2> 651059 INFO (SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] o.a.s.SolrTestCaseJ4 Using TrieFields
   [junit4]   2> 651061 INFO (SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] o.a.s.SolrTestCaseJ4 Randomized ssl (false) and clientAuth (true) via: @org.apache.solr.util.RandomizeSSL(reason=, ssl=NaN, value=NaN, clientAuth=NaN)
   [junit4]   2> 655245 WARN (SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] o.a.d.s.c.DefaultDirectoryService You didn't change the admin password of directory service instance 'DefaultKrbServer'. Please update the admin password as soon as possible to prevent a possible security breach.
   [junit4]   2> 655959 INFO (SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] o.a.s.SolrTestCaseJ4 ###deleteCore
   [junit4]   2> 655959 INFO (SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] o.a.s.SolrTestCaseJ4 --- Done waiting for all SolrIndexSearchers to be released
   [junit4]   2> 655959 INFO (SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] o.a.s.SolrTestCaseJ4 --- Done waiting for tracked resources to be released
   [junit4]   2> NOTE: test params are: codec=Lucene62, sim=RandomSimilarity(queryNorm=false,coord=crazy): {}, locale=es-AR, timezone=Africa/Casablanca
   [junit4]   2> NOTE: Linux 3.13.0-85-generic amd64/Oracle Corporation 1.8.0_121 (64-bit)/cpus=4,threads=1,free=215587256,total=531103744
   [junit4]   2> NOTE: All tests run in this JVM: [ScriptEngineTest, RuleEngineTest, DocumentAnalysisRequestHandlerTest, ClusterStateTest, XsltUpdateRequestHandlerTest, TestSolr4Spatial, TestSolrConfigHandlerConcurrent, UtilsToolTest, MergeStrategyTest, SubstringBytesRefFilterTest, AssignTest, TestRealTimeGet, TestHighlightDedupGrouping, AnalysisErrorHandlingTest, SignatureUpdateProcessorFactoryTest, SolrCoreMetricManagerTest, TestSolrConfigHandler, TestCollectionAPIs, TestFunctionQuery, TestBulkSchemaAPI, DistributedFacetPivotSmallTest, SolrCoreTest, TestSchemaSimilarityResource, TestRandomRequestDistribution, OverseerCollectionConfigSetProcessorTest, LeaderElectionIntegrationTest, TestPHPSerializedResponseWriter, TestCloudManagedSchema, TestSolrCoreParser, TestImplicitCoreProperties, TestExactSharedStatsCache, TestNumericTerms64, RecoveryAfterSoftCommitTest, TestFilteredDocIdSet, TestSchemalessBufferedUpdates, TestStressRecovery, TestSolrCoreProperties, TestSolrCloudSnapshots, HdfsCollectionsAPIDistributedZkTest, TestNonDefinedSimilarityFactory, AnalysisAfterCoreReloadTest, TestRTimerTree, OutputWriterTest, TestAddFieldRealTimeGet, TestCloudPivotFacet, TestRecoveryHdfs, TestQuerySenderNoQuery, TestConfigSetImmutable, BlockCacheTest, TriLevelCompositeIdRoutingTest, TestIBSimilarityFactory, SSLMigrationTest, TestCorePropertiesReload, SolrRequestParserTest, TestOmitPositions, ReturnFieldsTest,
[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs
[ https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881245#comment-15881245 ] Steve Rowe commented on SOLR-9764:
--

@yonik: The original commit on this issue included the following CHANGES entry:

bq. SOLR-9764: All filters that which all documents in the index now share the same memory (DocSet).

I think that the "which" in that sentence should instead be "match"?

> Design a memory efficient DocSet if a query returns all docs
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Michael Sun
> Assignee: Yonik Seeley
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR_9764_no_cloneMe.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch
>
> In some use cases, particularly use cases with time series data that use a collection alias and partition data into multiple small collections by timestamp, a filter query can match all documents in a collection. Currently BitDocSet is used, which contains a large array of long integers with every bit set to 1. After querying, the resulting DocSet saved in the filter cache is large and becomes one of the main memory consumers in these use cases.
> For example, suppose a Solr setup has 14 collections for data from the last 14 days, each collection holding one day of data. A filter query for the last week of data would result in at least six DocSets in the filter cache, each matching all documents in one of six collections.
> This issue is to design a new DocSet that is memory efficient for such a use case. The new DocSet removes the large array, reducing memory usage and GC pressure without losing the advantage of a large filter cache.
> In particular, for use cases that combine time series data, collection aliases, and timestamp-based partitioning into multiple small collections, the gain can be large.
> For further optimization, it may be helpful to design a DocSet with run length encoding. Thanks [~mmokhtar] for the suggestion.
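The memory argument in the description can be made concrete with a toy sketch. The MatchAllDocSet class below is hypothetical (it is not Solr's actual DocSet API); it only illustrates that a set known to match every document needs a single int rather than one bit per document:

```java
import java.util.BitSet;

/**
 * Illustrative sketch only: a doc set that matches every document needs no
 * per-document storage -- knowing maxDoc is enough. Class and method names
 * are hypothetical and do not reflect Solr's real DocSet hierarchy.
 */
public class MatchAllDocSet {
    private final int maxDoc; // number of documents in the index

    public MatchAllDocSet(int maxDoc) { this.maxDoc = maxDoc; }

    public int size() { return maxDoc; }

    public boolean exists(int docId) { return docId >= 0 && docId < maxDoc; }

    /** Intersecting with "all docs" is the identity, so just copy the other set. */
    public BitSet intersection(BitSet other) { return (BitSet) other.clone(); }

    /** Approximate heap cost of a BitSet-backed DocSet over maxDoc documents. */
    public static long bitSetBytes(int maxDoc) { return (maxDoc + 63) / 64 * 8L; }
}
```

For maxDoc = 1,000,000 the bit array alone costs about 125 KB per cached filter, while the sketch stores one int; multiplied across many collections and filter-cache entries, that is the saving this issue is after.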
[jira] [Commented] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length
[ https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881239#comment-15881239 ] Amrit Sarkar commented on LUCENE-7705:
--

I have cooked up a patch in SOLR-10186 and introduced a new constructor in CharTokenizer and the related Tokenizer factories, which takes _maxCharLen_ and _factory_ as additional parameters. Kindly provide your feedback and any comments on introducing new constructors in these classes. Thanks.

> Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length
> -
>
> Key: LUCENE-7705
> URL: https://issues.apache.org/jira/browse/LUCENE-7705
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Amrit Sarkar
> Priority: Minor
>
> SOLR-10186
> [~erickerickson]: Is there a good reason that we hard-code a 256 character limit for the CharTokenizer? Changing this limit currently requires copy/pasting incrementToken into some new class, since incrementToken is final.
> KeywordTokenizer can easily change the default (which is also 256 bytes), but doing so requires code rather than being configurable in the schema.
> For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) (Factories) it would take adding a c'tor to the base class in Lucene and using it in the factory.
> Any objections?
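For illustration, a stripped-down whitespace tokenizer with a configurable limit might look like the following. This is a standalone sketch, not Lucene's CharTokenizer; the constructor parameter mirrors the proposed _maxCharLen_ option, and the chunking behavior when a token fills the buffer is a simplification of what the real tokenizer does:

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified whitespace tokenizer illustrating a configurable max token
 *  length. Standalone sketch; Lucene's CharTokenizer hard-codes 256 here. */
public class SimpleCharTokenizer {
    private final int maxTokenLen;

    public SimpleCharTokenizer(int maxTokenLen) {
        if (maxTokenLen < 1) {
            throw new IllegalArgumentException("maxTokenLen must be >= 1");
        }
        this.maxTokenLen = maxTokenLen;
    }

    /** Split on whitespace; a run longer than maxTokenLen is emitted in
     *  chunks, mimicking a tokenizer that flushes a full buffer mid-token. */
    public List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (Character.isWhitespace(c)) {
                if (current.length() > 0) { tokens.add(current.toString()); current.setLength(0); }
            } else {
                current.append(c);
                if (current.length() == maxTokenLen) { tokens.add(current.toString()); current.setLength(0); }
            }
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }
}
```

With the limit set to 3, the input "abcdef gh" comes out as three tokens; raising the limit changes the result without any subclassing, which is the configurability the issue asks for.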
[jira] [Commented] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881219#comment-15881219 ] ASF subversion and git services commented on SOLR-10156: Commit 66fb1f83d64f5c79cedd4876e19a541eba30aed1 in lucene-solr's branch refs/heads/branch_6x from [~joel.bernstein] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=66fb1f8 ] SOLR-10156: Increase the overfetch > Add significantTerms Streaming Expression > - > > Key: SOLR-10156 > URL: https://issues.apache.org/jira/browse/SOLR-10156 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5 > > Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch > > > The significantTerms Streaming Expression will emit a set of terms from a > *text field* within a doc frequency range for a specific query. It will also > score the terms based on how many times the terms appear in the result set, > and how many times the terms appear in the corpus, and return the top N terms > based on this significance score. > Syntax: > {code} > significantTerms(collection, >q="abc", >field="some_text_field", >minDocFreq="x", >maxDocFreq="y", >limit="50") > {code}
[jira] [Commented] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881218#comment-15881218 ] ASF subversion and git services commented on SOLR-10156: Commit 744fbde1b6d770caafe0d0a4507fea30d08f8152 in lucene-solr's branch refs/heads/branch_6x from [~joel.bernstein] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=744fbde ] SOLR-10156: Add significantTerms Streaming Expression
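The issue description does not spell out the scoring formula, but the general shape of such significance scores can be sketched: a term scores high when it is frequent in the result set (foreground) relative to the whole corpus (background). The smoothed ratio below is illustrative only and is not the formula SOLR-10156 actually implements:

```java
/** Illustrative significance score for significantTerms-style ranking.
 *  NOT the formula from SOLR-10156; just the foreground-vs-background idea. */
public class SignificanceSketch {
    /**
     * @param foregroundCount how often the term occurs in the result set
     * @param foregroundTotal size of the result set
     * @param backgroundCount the term's document frequency in the corpus
     * @param backgroundTotal number of documents in the corpus
     */
    public static double score(long foregroundCount, long foregroundTotal,
                               long backgroundCount, long backgroundTotal) {
        double fg = (double) foregroundCount / foregroundTotal;
        // +1 smoothing so corpus-absent terms don't divide by zero
        double bg = (double) (backgroundCount + 1) / (backgroundTotal + 1);
        return fg / bg;
    }
}
```

A term in half the results but rare in the corpus scores far above a term in half the results that is also in half the corpus; the minDocFreq/maxDocFreq parameters in the expression then bound which terms are even considered.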
[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-7707:
Attachment: LUCENE-7707.patch

fix typo s/loosing/losing

> Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex value. The assumption made here is that all shard results are merged at once, which is not necessarily the case. If, for instance, incremental merge phases are applied, the shard index doesn't correspond to the index in the outer TopDocs array. To make this a backwards-compatible yet non-controversial change, we could change the internals of TopDocs#merge to only assign this value if it has not already been set to a non-default value (i.e. is still -1), to allow multiple or sparse top-docs merging.
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881181#comment-15881181 ] Michael McCandless commented on LUCENE-7707: +1, thanks [~simonw].
[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-7707: Attachment: LUCENE-7707.patch

updated javadocs
[jira] [Created] (SOLR-10198) EmbeddedSolrServer embedded behavior different from HttpSolrClient
Bert Summers created SOLR-10198:
---

Summary: EmbeddedSolrServer embedded behavior different from HttpSolrClient
Key: SOLR-10198
URL: https://issues.apache.org/jira/browse/SOLR-10198
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: SolrJ
Affects Versions: 6.4.1
Reporter: Bert Summers

When retrieving the value of a field, the object type is different depending on the server type. If I have a schema which has

If I do

solrClient.queryAndStreamResponse("test", new SolrQuery("*:*"), new StreamingResponseCallback() {
  @Override
  public void streamSolrDocument(final SolrDocument doc) {
    Object idField = doc.getFieldValue("id");
  }

  @Override
  public void streamDocListInfo(final long numFound, final long start, final Float maxScore) {
    System.out.println("Found " + numFound + " documents");
  }
});

then in streamSolrDocument the object type is Integer if the server is HTTP but StoredField if embedded. Both the server and the embedded instance use the same schema.xml and solrconfig.xml. In version 5.1.0 both connections would return the same type (Integer).
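Until the inconsistency is resolved, calling code can normalize the returned value. The sketch below substitutes a stand-in StoredField class so it runs without Lucene on the classpath; against a real embedded server the equivalent check would target Lucene's stored-field type and its numericValue()/stringValue() accessors. This is a suggested client-side workaround, not a confirmed fix:

```java
public class FieldValueNormalizer {
    /** Stand-in for Lucene's StoredField so this sketch is self-contained.
     *  The real Lucene class also exposes numericValue()/stringValue(). */
    public static class StoredField {
        private final Object value;
        public StoredField(Object value) { this.value = value; }
        public Number numericValue() { return value instanceof Number ? (Number) value : null; }
        public String stringValue() { return value instanceof String ? (String) value : null; }
    }

    /** Normalize a field value: HttpSolrClient hands back the raw Integer or
     *  String, while EmbeddedSolrServer (per this report) may hand back the
     *  stored-field wrapper; unwrap it so callers see one type. */
    public static Object normalize(Object raw) {
        if (raw instanceof StoredField) {
            StoredField f = (StoredField) raw;
            return f.numericValue() != null ? (Object) f.numericValue() : f.stringValue();
        }
        return raw; // already an Integer, String, etc.
    }
}
```

In the reported callback, `doc.getFieldValue("id")` would be passed through `normalize` before use, so the same code path handles both server types.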
[jira] [Commented] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881145#comment-15881145 ] ASF subversion and git services commented on SOLR-10156: Commit a0aef2faaf7da56efc8ac4b004e9d3b8dc401e81 in lucene-solr's branch refs/heads/master from [~joel.bernstein] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a0aef2f ] SOLR-10156: Increase the overfetch
[jira] [Commented] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881144#comment-15881144 ] ASF subversion and git services commented on SOLR-10156: Commit dba733e7aa90bd607fdda0342b94bc17bb717c31 in lucene-solr's branch refs/heads/master from [~joel.bernstein] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=dba733e ] SOLR-10156: Add significantTerms Streaming Expression
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881132#comment-15881132 ] Michael McCandless commented on LUCENE-7707: +1, looks awesome! Maybe update the javadocs to explain that we will either fill in the shardIndex or will not but never both.
[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-7707: Attachment: LUCENE-7707.patch

s/where/were
[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-7707: Attachment: LUCENE-7707.patch

here is a new patch adding more safety to it and making the decision up-front if we assign shardIndex or not
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881076#comment-15881076 ] Simon Willnauer commented on LUCENE-7707:
-

bq. Maybe we could require that either all incoming shardIndex are undefined, or all are set, but you are not allowed to mix?

I think this is what we should ultimately do. I don't see a different way than peeking at the TopDocs to see if it's preset and then executing based on that. I can certainly add assertions...
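The "all preset or all unset, never mixed" rule discussed above can be sketched as an up-front check: if every incoming hit still has the default shardIndex of -1, merge assigns indices; if every hit is preset, merge leaves them alone; mixing the two is rejected. This is a simplified model over plain int arrays, not the actual TopDocs#merge code from the patch:

```java
/** Up-front decision for a TopDocs.merge-style operation: should the merge
 *  assign shardIndex values? Simplified sketch over int[][] (one inner array
 *  of shardIndex values per shard), not Lucene's real TopDocs API. */
public class ShardIndexPolicy {
    public static boolean shouldAssignShardIndex(int[][] shardIndexesPerShard) {
        Boolean allUnset = null; // null until the first hit is seen
        for (int[] shard : shardIndexesPerShard) {
            for (int idx : shard) {
                boolean thisUnset = (idx == -1); // -1 is the documented default
                if (allUnset == null) {
                    allUnset = thisUnset;
                } else if (allUnset != thisUnset) {
                    // the mix the thread proposes to forbid
                    throw new IllegalArgumentException(
                        "mixed preset and unset shardIndex values across shards");
                }
            }
        }
        return allUnset == null || allUnset; // no hits: assigning is harmless
    }
}
```

Making the decision once, before merging, is what keeps the change backwards compatible: existing callers pass all-unset hits and keep the old assign-on-merge behavior, while incremental mergers pass preset hits and keep their indices.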
[jira] [Commented] (SOLR-9887) Add KeepWordFilter, StemmerOverrideFilter, StopFilterFactory, SynonymFilter that reads data from a JDBC source
[ https://issues.apache.org/jira/browse/SOLR-9887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881013#comment-15881013 ] Christine Poerschke commented on SOLR-9887: --- (Late to the party here.) I think support for stop words, synonyms, etc. from sources other than text files would be a useful feature for Solr and using streaming expressions to 'abstract away' the source of the stop words sounds like a good generalisation. What might a preferred and suitable approach be to take this forward? In no particular order: * Option 1: config-to-code ** starting with the existing config e.g. {{}} work out and sketch out what the new streaming expressions based configuration will look like ** coding up of that solution * Option 2: build-upon-existing ** creation of a pull request against lucene-solr based upon https://github.com/shopping24/solr-jdbc as per above ** transformation of that pull request into streaming expressions based approach * Option 3: ** (From my very positive and collaborative experiences on SOLR-5730 and SOLR-8621 my preference/recommendation would probably be 'Option 1' rather than 'Option 2' and I'd be very interested to hear what Option 3, 4, etc. might be also.) > Add KeepWordFilter, StemmerOverrideFilter, StopFilterFactory, SynonymFilter > that reads data from a JDBC source > -- > > Key: SOLR-9887 > URL: https://issues.apache.org/jira/browse/SOLR-9887 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Tobias Kässmann >Priority: Minor > > We've created some new {{FilterFactories}} that reads their stopwords or > synonyms from a database (by a JDBC source). That enables us a easy > management of large lists and also add the possibility to do this in other > tools. JDBC data sources are retrieved via JNDI. > For a easy reload of this lists we've added a {{SeacherAwareReloader}} > abstraciton that reloads this lists on every new searcher event. 
> If this is a feature that is interesting for Solr, we will create a pull > request. All the sources are currently available here: > https://github.com/shopping24/solr-jdbc -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
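The reload-on-new-searcher idea from the description can be sketched independently of the JDBC plumbing. Below is a minimal, hypothetical sketch (the names are invented, not taken from the solr-jdbc project): a word list re-reads its contents from a supplier whenever a "new searcher" event fires. A real implementation would back the supplier with a JDBC query obtained via JNDI instead of an in-memory set.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical sketch of a searcher-aware, reloadable word list. */
public class ReloadableWordList {
    private final Supplier<Set<String>> source; // would wrap a JDBC query in practice
    private volatile Set<String> words = Collections.emptySet();

    public ReloadableWordList(Supplier<Set<String>> source) {
        this.source = source;
        reload(); // initial load at construction time
    }

    /** Called on every new-searcher event to pick up changes in the backing store. */
    public void reload() {
        words = new HashSet<>(source.get()); // snapshot copy, so lookups never see a half-loaded list
    }

    public boolean contains(String word) {
        return words.contains(word);
    }
}
```

In Solr terms, the reload() call would presumably be triggered from a newSearcher event listener, which matches the "reloads these lists on every new searcher event" behaviour described above.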
[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection
[ https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881006#comment-15881006 ] Ishan Chattopadhyaya commented on SOLR-10130: - Adding a link to https://issues.apache.org/jira/browse/SOLR-10182 for backing out the changes that caused these perf degradations. > Serious performance degradation in Solr 6.4.1 due to the new metrics > collection > --- > > Key: SOLR-10130 > URL: https://issues.apache.org/jira/browse/SOLR-10130 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Affects Versions: 6.4.1 > Environment: Centos 7, OpenJDK 1.8.0 update 111 >Reporter: Ere Maijala >Assignee: Andrzej Bialecki >Priority: Blocker > Labels: perfomance > Fix For: master (7.0), 6.4.2 > > Attachments: SOLR-10130.patch, SOLR-10130.patch, > solr-8983-console-f1.log > > > We've stumbled on serious performance issues after upgrading to Solr 6.4.1. > Looks like the new metrics collection system in MetricsDirectoryFactory is > causing a major slowdown. This happens with an index configuration that, as > far as I can see, has no metrics specific configuration and uses > luceneMatchVersion 5.5.0. In practice a moderate load will completely bog > down the server, with Solr threads constantly using up all CPU capacity (600% on a 6-core > machine) under a load where we normally see an average > of < 50%. > I took stack traces (I'll attach them) and noticed that the threads are > spending time in com.codahale.metrics.Meter.mark. I tested building Solr > 6.4.1 with the metrics collection disabled in MetricsDirectoryFactory getByte > and getBytes methods and was unable to reproduce the issue. > As far as I can see there are several issues: > 1. Collecting metrics on every single byte read is slow. > 2. Having it enabled by default is not a good idea. > 3. 
The comment "enable coarse-grained metrics by default" at > https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L104 > implies that only coarse-grained metrics should be enabled by default, which > contradicts collecting metrics on every single byte read. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
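The cost pattern diagnosed above, a Meter.mark() on every single getByte call versus one update per buffered read, can be made concrete with a toy counter. This is a schematic comparison only, not Solr's actual MetricsDirectoryFactory code:

```java
/** Toy illustration: per-byte metering does N meter updates for N bytes read,
 *  while buffer-level metering does one update per read call. */
public class MeterCost {
    /** Simulates one meter.mark() per byte, as in the slow per-byte path. */
    public static long perByteUpdates(int totalBytes) {
        long updates = 0;
        for (int i = 0; i < totalBytes; i++) {
            updates++; // one metrics update per byte
        }
        return updates;
    }

    /** Simulates one meter.mark(n) per buffered read of up to bufferSize bytes. */
    public static long perBufferUpdates(int totalBytes, int bufferSize) {
        return (totalBytes + bufferSize - 1) / bufferSize; // ceiling division
    }
}
```

For an 8 KB read with a 1 KB buffer this is 8192 metric updates versus 8, which is the rough shape of the slowdown the stack traces point at.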
[jira] [Commented] (SOLR-10155) Clarify logic for term filters on numeric types
[ https://issues.apache.org/jira/browse/SOLR-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880996#comment-15880996 ] Adrien Grand commented on SOLR-10155: - +1 to explicitly reject facet.contains and facet.prefix on numerics with a clear error message > Clarify logic for term filters on numeric types > --- > > Key: SOLR-10155 > URL: https://issues.apache.org/jira/browse/SOLR-10155 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: faceting >Affects Versions: 6.4.1 >Reporter: Gus Heck >Assignee: Christine Poerschke >Priority: Minor > Attachments: SOLR-10155.patch > > > The following code has been found to be confusing to multiple folks working > in SimpleFacets.java (see SOLR-10132) > {code} > if (termFilter != null) { > // TODO: understand this logic... what is the case for > supporting an empty string > // for contains on numeric facets? What does that achieve? > // The exception message is misleading in the case of an > excludeTerms filter in any case... > // Also maybe vulnerable to NPE on isEmpty test? > final boolean supportedOperation = (termFilter instanceof > SubstringBytesRefFilter) && ((SubstringBytesRefFilter) > termFilter).substring().isEmpty(); > if (!supportedOperation) { > throw new SolrException(ErrorCode.BAD_REQUEST, > FacetParams.FACET_CONTAINS + " is not supported on numeric types"); > } > } > {code} > This is found around line 482 or so. The comment in the code above is mine, > and won't be found in the codebase. This ticket can be resolved by > eliminating the complex check and just denying all termFilters with a better > exception message not specific to contains filters (and perhaps consolidated > with the preceding check about prefix filters?), or adding a comment to > the code base explaining why we need to allow a term filter with an empty, > non-null string to be processed, and why this isn't an NPE waiting to happen. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
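One of the resolutions suggested in the ticket, denying all term filters on numeric types with a clearer, filter-agnostic message, can be sketched as a null-safe guard. This is a hypothetical simplification for illustration, not the committed SimpleFacets code:

```java
/** Hypothetical simplified guard: reject any term filter on a numeric field,
 *  regardless of whether it is a contains, prefix, or exclude-terms filter. */
public class NumericFacetGuard {
    public static void checkTermFilter(Object termFilter, boolean numericField) {
        if (numericField && termFilter != null) {
            // One check covers all filter kinds, and it cannot NPE because the
            // filter's contents are never dereferenced (unlike the isEmpty() test).
            throw new IllegalArgumentException(
                "term filters (facet.contains/facet.prefix/facet.excludeTerms)"
                + " are not supported on numeric types");
        }
    }
}
```

The trade-off Christine raises elsewhere in the thread still applies: this rejects the empty-string "turn it off" usage, so the behaviour change would need a CHANGES.txt note.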
[jira] [Resolved] (SOLR-10173) Enable extension/customization of HttpShardHandler by increasing visibility
[ https://issues.apache.org/jira/browse/SOLR-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke resolved SOLR-10173. Resolution: Fixed Fix Version/s: master (7.0) 6.x Thanks Ramsey! > Enable extension/customization of HttpShardHandler by increasing visibility > --- > > Key: SOLR-10173 > URL: https://issues.apache.org/jira/browse/SOLR-10173 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ramsey Haddad >Assignee: Christine Poerschke >Priority: Minor > Fix For: 6.x, master (7.0) > > Attachments: solr-10173.patch, SOLR-10173.patch > > Original Estimate: 1h > Remaining Estimate: 1h > > Increase visibility of 2 elements of HttpShardHandlerFactory from "private" > to "protected" to facilitate extension of the class. Make > ReplicaListTransformer "public" to enable implementation of the interface in > custom classes. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-10155) Clarify logic for term filters on numeric types
[ https://issues.apache.org/jira/browse/SOLR-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880988#comment-15880988 ] Christine Poerschke commented on SOLR-10155: bq. ... whether there's a use case for passing blanks through ... supplying a blank is the means of "turning it off" without blowing up ... That's a fair point, yes, the change in behaviour would have to be documented clearly in the CHANGES.txt e.g. something along the lines of _"facet.contains= is now rejected for numeric types"_. So then, yes, would it make sense to apply the same change to facet.prefix with a joint _"facet.contains= and facet.prefix= are now rejected for numeric types"_ CHANGES.txt note? [~jpountz] - would you have any thoughts on this, following on from the (long time ago) SOLR-3855 commit Gus mentioned above? > Clarify logic for term filters on numeric types > --- > > Key: SOLR-10155 > URL: https://issues.apache.org/jira/browse/SOLR-10155 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: faceting >Affects Versions: 6.4.1 >Reporter: Gus Heck >Priority: Minor > Attachments: SOLR-10155.patch > > > The following code has been found to be confusing to multiple folks working > in SimpleFacets.java (see SOLR-10132) > {code} > if (termFilter != null) { > // TODO: understand this logic... what is the case for > supporting an empty string > // for contains on numeric facets? What does that achieve? > // The exception message is misleading in the case of an > excludeTerms filter in any case... > // Also maybe vulnerable to NPE on isEmpty test? 
> final boolean supportedOperation = (termFilter instanceof > SubstringBytesRefFilter) && ((SubstringBytesRefFilter) > termFilter).substring().isEmpty(); > if (!supportedOperation) { > throw new SolrException(ErrorCode.BAD_REQUEST, > FacetParams.FACET_CONTAINS + " is not supported on numeric types"); > } > } > {code} > This is found around line 482 or so. The comment in the code above is mine, > and won't be found in the codebase. This ticket can be resolved by > eliminating the complex check and just denying all termFilters with a better > exception message not specific to contains filters (and perhaps consolidated > with the preceding check about prefix filters?), or adding a comment to > the code base explaining why we need to allow a term filter with an empty, > non-null string to be processed, and why this isn't an NPE waiting to happen. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-10155) Clarify logic for term filters on numeric types
[ https://issues.apache.org/jira/browse/SOLR-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke reassigned SOLR-10155: -- Assignee: Christine Poerschke > Clarify logic for term filters on numeric types > --- > > Key: SOLR-10155 > URL: https://issues.apache.org/jira/browse/SOLR-10155 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: faceting >Affects Versions: 6.4.1 >Reporter: Gus Heck >Assignee: Christine Poerschke >Priority: Minor > Attachments: SOLR-10155.patch > > > The following code has been found to be confusing to multiple folks working > in SimpleFacets.java (see SOLR-10132) > {code} > if (termFilter != null) { > // TODO: understand this logic... what is the case for > supporting an empty string > // for contains on numeric facets? What does that achieve? > // The exception message is misleading in the case of an > excludeTerms filter in any case... > // Also maybe vulnerable to NPE on isEmpty test? > final boolean supportedOperation = (termFilter instanceof > SubstringBytesRefFilter) && ((SubstringBytesRefFilter) > termFilter).substring().isEmpty(); > if (!supportedOperation) { > throw new SolrException(ErrorCode.BAD_REQUEST, > FacetParams.FACET_CONTAINS + " is not supported on numeric types"); > } > } > {code} > This is found around line 482 or so. The comment in the code above is mine, > and won't be found in the codebase. This ticket can be resolved by > eliminating the complex check and just denying all termFilters with a better > exception message not specific to contains filters (and perhaps consolidated > with the preceding check about prefix filters?), or adding a comment to > the code base explaining why we need to allow a term filter with an empty, > non-null string to be processed, and why this isn't an NPE waiting to happen. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880862#comment-15880862 ] Michael McCandless commented on LUCENE-7707: Maybe we could require that either all incoming {{shardIndex}} are undefined, or all are set, but you are not allowed to mix? > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch, LUCENE-7707.patch > > > When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > only assign this value if it has not already been set to a non-default > value (i.e. it is still -1), allowing multiple or sparse top docs merging. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
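Michael's "all set or all unset, no mixing" requirement could be enforced with a simple up-front scan over the incoming shard indices. This is a sketch of the validation idea, not code from the actual patch:

```java
/** Sketch: detect a mix of assigned (>= 0) and unassigned (-1) shard indices,
 *  which is the ambiguous state the merge would want to reject. */
public class ShardIndexCheck {
    public static boolean allSetOrAllUnset(int[] shardIndices) {
        boolean sawSet = false, sawUnset = false;
        for (int idx : shardIndices) {
            if (idx == -1) {
                sawUnset = true;  // still at the -1 default
            } else {
                sawSet = true;    // caller pre-assigned a shard index
            }
        }
        return !(sawSet && sawUnset); // mixing is the only illegal state
    }
}
```

Rejecting the mixed case up front would also answer Adrien's concern later in the thread about accidentally assigning a shard id that is already in use.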
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880859#comment-15880859 ] Jim Ferenczi commented on LUCENE-7707: -- bq. Plus I think it's very unlikely someone today is pre-setting the shardIndex (off of its default -1 value) and then relying on TopDocs.merge Good point. +1 to the patch too, there's nothing to break here ;) > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch, LUCENE-7707.patch > > > When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > only assign this value if it has not already been set to a non-default > value (i.e. it is still -1), allowing multiple or sparse top docs merging. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880854#comment-15880854 ] Adrien Grand commented on LUCENE-7707: -- I don't like the fact that if you mix top docs that have the shard index set and other instances that have it undefined, then we could end up assigning a shard id that is already in use. Is there a way we can avoid that? > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch, LUCENE-7707.patch > > > When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > only assign this value if it has not already been set to a non-default > value (i.e. it is still -1), allowing multiple or sparse top docs merging. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-10156: -- Attachment: SOLR-10156.patch > Add significantTerms Streaming Expression > - > > Key: SOLR-10156 > URL: https://issues.apache.org/jira/browse/SOLR-10156 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5 > > Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch > > > The significantTerms Streaming Expression will emit a set of terms from a > *text field* within a doc frequency range for a specific query. It will also > score the terms based on how many times the terms appear in the result set, > and how many times the terms appear in the corpus, and return the top N terms > based on this significance score. > Syntax: > {code} > significantTerms(collection, >q="abc", >field="some_text_field", >minDocFreq="x", >maxDocFreq="y", >limit="50") > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
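The kind of scoring the description sketches, rewarding terms that are frequent in the result set but rare in the corpus, is commonly expressed as a foreground count scaled by an inverse-document-frequency factor. The formula below is an illustrative heuristic of that shape only, not necessarily what the attached SOLR-10156 patch implements:

```java
/** Illustrative significance heuristic: fgCount * log(corpusDocs / docFreq).
 *  A term gets a high score when it occurs often in the foreground result set
 *  (fgCount) but appears in few documents corpus-wide (docFreq). */
public class SignificanceSketch {
    public static double score(long fgCount, long docFreq, long corpusDocs) {
        if (docFreq == 0) {
            return 0.0; // term absent from the corpus: nothing to compare against
        }
        return fgCount * Math.log((double) corpusDocs / docFreq);
    }
}
```

Under this heuristic a term appearing in every corpus document scores zero (log 1 = 0), while a rare term with the same foreground count scores much higher, which matches the intent of the minDocFreq/maxDocFreq window in the expression syntax above.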
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880838#comment-15880838 ] Michael McCandless commented on LUCENE-7707: +1 to the patch. bq. I personally would like that but Michael McCandless had some issues with this? Yeah, I'd prefer not to add a boolean argument: that's allowing a temporary back compat issue to have a permanent impact on our APIs. Our APIs should be designed for future usage. Plus I think it's very unlikely someone today is pre-setting the shardIndex (off of its default -1 value) and then relying on TopDocs.merge to overwrite it. I think the patch is sufficient back compat behavior w/o a compromised API change. > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch, LUCENE-7707.patch > > > When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > only assign this value if it has not already been set to a non-default > value (i.e. it is still -1), allowing multiple or sparse top docs merging. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
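The behavior this thread converges on can be sketched on plain arrays: during a merge, a hit's shardIndex is written only when it is still the -1 default, so values pre-set by an earlier (incremental) merge phase survive. This is schematic, not Lucene's actual TopDocs code:

```java
/** Sketch of conditional shardIndex assignment during a merge:
 *  -1 entries take the position of their source TopDocs in the outer array,
 *  pre-assigned entries are left untouched. */
public class MergeSketch {
    public static int[] assignShardIndices(int[] shardIndices, int[] shardOfOrigin) {
        int[] out = shardIndices.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] == -1) {
                out[i] = shardOfOrigin[i]; // unassigned: fill in from the outer array
            }
            // already assigned: preserved, enabling incremental/sparse merges
        }
        return out;
    }
}
```

This is why the change is backwards compatible: callers that never touch shardIndex see exactly the old overwrite behavior, since all their values are -1 going in.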
[JENKINS] Lucene-Solr-Tests-master - Build # 1690 - Unstable
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/1690/ 3 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.core.TestLazyCores Error Message: ObjectTracker found 5 object(s) that were not released!!! [MMapDirectory, MMapDirectory, SolrCore, MMapDirectory, MDCAwareThreadPoolExecutor] org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: org.apache.lucene.store.MMapDirectory at org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42) at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:347) at org.apache.solr.core.MetricsDirectoryFactory.get(MetricsDirectoryFactory.java:208) at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:348) at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:694) at org.apache.solr.core.SolrCore.(SolrCore.java:911) at org.apache.solr.core.SolrCore.(SolrCore.java:828) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:937) at org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:572) at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: org.apache.lucene.store.MMapDirectory at org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42) at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:347) at org.apache.solr.core.MetricsDirectoryFactory.get(MetricsDirectoryFactory.java:208) at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:98) 
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:726) at org.apache.solr.core.SolrCore.(SolrCore.java:911) at org.apache.solr.core.SolrCore.(SolrCore.java:828) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:937) at org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:572) at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: org.apache.solr.core.SolrCore at org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42) at org.apache.solr.core.SolrCore.(SolrCore.java:1001) at org.apache.solr.core.SolrCore.(SolrCore.java:828) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:937) at org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:572) at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: org.apache.lucene.store.MMapDirectory at org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42) at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:347) at org.apache.solr.core.MetricsDirectoryFactory.get(MetricsDirectoryFactory.java:208) at org.apache.solr.core.SolrCore.initSnapshotMetaDataManager(SolrCore.java:479) at org.apache.solr.core.SolrCore.(SolrCore.java:905) at org.apache.solr.core.SolrCore.(SolrCore.java:828) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:937) at org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:572) at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at
[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880794#comment-15880794 ] Uwe Schindler commented on SOLR-9450: - Actually it's much easier: {noformat} solr.javadoc.url=${JOB_URL}javadoc/ {noformat} > Link to online Javadocs instead of distributing with binary download > > > Key: SOLR-9450 > URL: https://issues.apache.org/jira/browse/SOLR-9450 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Assignee: Uwe Schindler > Fix For: 6.5, master (7.0) > > Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, > SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch > > > Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} > in the binary download with a link to the online JavaDocs. The build should > make sure to generate a link to the correct version. I believe this is the > correct template: http://lucene.apache.org/solr/6_2_0/ -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880593#comment-15880593 ] Uwe Schindler edited comment on SOLR-9450 at 2/23/17 4:25 PM: -- I updated the Jenkins Jobs for artifacts and added: {noformat} solr.javadoc.url=${JENKINS_URL}job/${JOB_NAME}/javadoc/ {noformat} was (Author: thetaphi): I updated the Jenkins Jobs for artifacts and added: {noformat} solr.javadoc.url=${JENKINS_URL}/job/${JOB_NAME}/javadoc/ {noformat} > Link to online Javadocs instead of distributing with binary download > > > Key: SOLR-9450 > URL: https://issues.apache.org/jira/browse/SOLR-9450 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Assignee: Uwe Schindler > Fix For: 6.5, master (7.0) > > Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, > SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch > > > Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} > in the binary download with a link to the online JavaDocs. The build should > make sure to generate a link to the correct version. I believe this is the > correct template: http://lucene.apache.org/solr/6_2_0/ -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
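The comment edit above fixes a doubled slash: Jenkins's JENKINS_URL already ends with "/", so writing ${JENKINS_URL}/job/... yields "//" in the resulting URL. A small helper that joins URL segments regardless of trailing/leading slashes avoids this class of mistake entirely (a generic sketch, unrelated to the actual build scripts):

```java
/** Sketch: join a base URL and a path without doubling or dropping the slash. */
public class UrlJoin {
    public static String join(String base, String path) {
        boolean baseSlash = base.endsWith("/");
        boolean pathSlash = path.startsWith("/");
        if (baseSlash && pathSlash) {
            return base + path.substring(1); // drop one of the two slashes
        }
        if (!baseSlash && !pathSlash) {
            return base + "/" + path;        // insert the missing slash
        }
        return base + path;                  // exactly one slash already present
    }
}
```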
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880767#comment-15880767 ] Simon Willnauer commented on LUCENE-7707: - I think making this solely dependent on a boolean would be best. It would be an additional overload of the methods that explicitly turns on setting shardIndex on the ScoreDoc, so we don't need as many conditionals in the tie-breaking. I personally would like that but [~mikemccand] had some issues with this? > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch, LUCENE-7707.patch > > > When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > only assign this value if it has not already been set to a non-default > value (i.e. it is still -1), allowing multiple or sparse top docs merging. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880708#comment-15880708 ] Jim Ferenczi commented on LUCENE-7707: -- +1, this will make the merge more flexible. If we really want to be sure that it does not break BWC, maybe it can be an option of the merge function? A simple boolean overrideShardIndex with a default value of false? > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch, LUCENE-7707.patch > > > When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > only assign this value if it has not already been set to a non-default > value (i.e. it is still -1), allowing multiple or sparse top docs merging. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-7707: Attachment: LUCENE-7707.patch here is another iteration that makes sorting stable and shares some tie-breaking code between the sort-based and score-based merge paths > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch, LUCENE-7707.patch > > > When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > only assign this value if it has not already been set to a non-default > value (i.e. it is still -1), allowing multiple or sparse top docs merging. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
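The stable tie-breaking Simon mentions can be sketched as a comparator: equal scores fall back to shard index, then doc id, so a merge always produces the same order for tied hits. This is illustrative of the idea, not the patch's code:

```java
import java.util.Comparator;

/** Sketch of a stable score-merge tie-break:
 *  score descending, then shardIndex ascending, then doc id ascending. */
public class TieBreakSketch {
    public static final class Hit {
        final float score;
        final int shardIndex;
        final int doc;
        public Hit(float score, int shardIndex, int doc) {
            this.score = score;
            this.shardIndex = shardIndex;
            this.doc = doc;
        }
    }

    public static final Comparator<Hit> BY_SCORE_THEN_TIEBREAK =
        Comparator.<Hit>comparingDouble(h -> -h.score)  // higher score first
                  .thenComparingInt(h -> h.shardIndex)  // then earlier shard
                  .thenComparingInt(h -> h.doc);        // then lower doc id
}
```

Sharing one such comparator between the sort-based and score-based merge paths is what keeps the two code paths consistent, which appears to be the point of the refactor described above.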
[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_121) - Build # 19034 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/19034/ Java: 64bit/jdk1.8.0_121 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.security.hadoop.TestDelegationWithHadoopAuth.testDelegationTokenRenew Error Message: expected:<200> but was:<403> Stack Trace: java.lang.AssertionError: expected:<200> but was:<403> at __randomizedtesting.SeedInfo.seed([CB220527BCDC9F4B:FCB9F139841042EF]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.security.hadoop.TestDelegationWithHadoopAuth.renewDelegationToken(TestDelegationWithHadoopAuth.java:118) at org.apache.solr.security.hadoop.TestDelegationWithHadoopAuth.verifyDelegationTokenRenew(TestDelegationWithHadoopAuth.java:302) at org.apache.solr.security.hadoop.TestDelegationWithHadoopAuth.testDelegationTokenRenew(TestDelegationWithHadoopAuth.java:319) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at
[jira] [Commented] (SOLR-10092) HDFS: AutoAddReplica fails
[ https://issues.apache.org/jira/browse/SOLR-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880651#comment-15880651 ] Hendrik Haddorp commented on SOLR-10092: Sorry for the spam; it looks like I tested my patch incorrectly last time. Solr 6.3 on HDFS with legacyMode=false fails with the stated exception, but just using my patch does not fix that. The exception is gone, but then I get: org.apache.solr.common.SolrException: coreNodeName core_node1 exists, but does not match expected node or core name: DocCollection(test.test3//collections/test.test3/state.json/50)={ } at org.apache.solr.cloud.ZkController.checkStateInZk(ZkController.java:1562) at org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1488) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:837) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:779) > HDFS: AutoAddReplica fails > -- > > Key: SOLR-10092 > URL: https://issues.apache.org/jira/browse/SOLR-10092 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: hdfs >Affects Versions: 6.3 >Reporter: Hendrik Haddorp > Attachments: SOLR-10092.patch > > > OverseerAutoReplicaFailoverThread fails to create replacement core with this > exception: > o.a.s.c.OverseerAutoReplicaFailoverThread Exception trying to create new > replica on > http://...:9000/solr:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: > Error from server at http://...:9000/solr: Error CREATEing SolrCore > 'test2.collection-09_shard1_replica1': Unable to create core > [test2.collection-09_shard1_replica1] Caused by: No shard id for > CoreDescriptor[name=test2.collection-09_shard1_replica1;instanceDir=/var/opt/solr/test2.collection-09_shard1_replica1] > at > org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:593) > at > org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262) > at > org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:251) > at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219) > at > org.apache.solr.cloud.OverseerAutoReplicaFailoverThread.createSolrCore(OverseerAutoReplicaFailoverThread.java:456) > at > org.apache.solr.cloud.OverseerAutoReplicaFailoverThread.lambda$addReplica$0(OverseerAutoReplicaFailoverThread.java:251) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > also see this mail thread about the issue: > https://lists.apache.org/thread.html/%3CCAA70BoWyzbvQuJTyzaG4Kx1tj0Djgcm+MV=x_hoac1e6cse...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional 
commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-10079) TestInPlaceUpdatesDistrib failure
[ https://issues.apache.org/jira/browse/SOLR-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880602#comment-15880602 ] Steve Rowe commented on SOLR-10079: --- My Jenkins found a reproducing branch_6x seed: {noformat} [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestInPlaceUpdatesDistrib -Dtests.method=test -Dtests.seed=B88AA94CC5E07DDA -Dtests.slow=true -Dtests.locale=en-GB -Dtests.timezone=Africa/Tripoli -Dtests.asserts=true -Dtests.file.encoding=UTF-8 [junit4] FAILURE 40.7s J1 | TestInPlaceUpdatesDistrib.test <<< [junit4]> Throwable #1: java.lang.AssertionError: 'sanitycheck' results against client: org.apache.solr.client.solrj.impl.HttpSolrClient@3cc2aada (not leader) wrong [docid] for SolrDocument{id=10, id_field_copy_that_does_not_support_in_place_update_s=10, title_s=title10, id_i=10, inplace_updatable_float=101.0, _version_=1560081900526108672, inplace_updatable_int_with_default=666, inplace_updatable_float_with_default=42.0, [docid]=322} expected:<306> but was:<322> [junit4]>at __randomizedtesting.SeedInfo.seed([B88AA94CC5E07DDA:30DE96966B1C1022]:0) [junit4]>at org.apache.solr.update.TestInPlaceUpdatesDistrib.assertDocIdsAndValuesInResults(TestInPlaceUpdatesDistrib.java:442) [junit4]>at org.apache.solr.update.TestInPlaceUpdatesDistrib.assertDocIdsAndValuesAgainstAllClients(TestInPlaceUpdatesDistrib.java:413) [junit4]>at org.apache.solr.update.TestInPlaceUpdatesDistrib.docValuesUpdateTest(TestInPlaceUpdatesDistrib.java:321) [junit4]>at org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:140) [junit4]>at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992) [junit4]>at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967) [...] 
[junit4] 2> NOTE: test params are: codec=Asserting(Lucene62): {title_s=PostingsFormat(name=Memory doPackFST= false), id=PostingsFormat(name=Direct), id_field_copy_that_does_not_support_in_place_update_s=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128)))}, docValues:{inplace_updatable_float=DocValuesFormat(name=Direct), id_i=DocValuesFormat(name=Memory), _version_=DocValuesFormat(name=Lucene54), title_s=DocValuesFormat(name=Direct), id=DocValuesFormat(name=Lucene54), id_field_copy_that_does_not_support_in_place_update_s=DocValuesFormat(name=Lucene54), inplace_updatable_int_with_default=DocValuesFormat(name=Direct), inplace_updatable_float_with_default=DocValuesFormat(name=Memory)}, maxPointsInLeafNode=880, maxMBSortInHeap=5.97801431291, sim=RandomSimilarity(queryNorm=true,coord=no): {}, locale=en-GB, timezone=Africa/Tripoli [junit4] 2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 1.8.0_77 (64-bit)/cpus=16,threads=1,free=210993488,total=526909440 {noformat} > TestInPlaceUpdatesDistrib failure > - > > Key: SOLR-10079 > URL: https://issues.apache.org/jira/browse/SOLR-10079 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Steve Rowe >Assignee: Ishan Chattopadhyaya > Attachments: SOLR-10079.patch, stdout > > > From [https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18881/], > reproduces for me: > {noformat} > Checking out Revision d8d61ff61d1d798f5e3853ef66bc485d0d403f18 > (refs/remotes/origin/master) > [...] 
>[junit4] 2> NOTE: reproduce with: ant test > -Dtestcase=TestInPlaceUpdatesDistrib -Dtests.method=test > -Dtests.seed=E1BB56269B8215B0 -Dtests.multiplier=3 -Dtests.slow=true > -Dtests.locale=sr-Latn-RS -Dtests.timezone=America/Grand_Turk > -Dtests.asserts=true -Dtests.file.encoding=UTF-8 >[junit4] FAILURE 77.7s J2 | TestInPlaceUpdatesDistrib.test <<< >[junit4]> Throwable #1: java.lang.AssertionError: Earlier: [79, 79, > 79], now: [78, 78, 78] >[junit4]> at > __randomizedtesting.SeedInfo.seed([E1BB56269B8215B0:69EF69FC357E7848]:0) >[junit4]> at > org.apache.solr.update.TestInPlaceUpdatesDistrib.ensureRtgWorksWithPartialUpdatesTest(TestInPlaceUpdatesDistrib.java:425) >[junit4]> at > org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:142) >[junit4]> at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >[junit4]> at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >[junit4]> at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >
[jira] [Closed] (SOLR-7764) Solr indexing hangs if it encounters a certain XML parse error
[ https://issues.apache.org/jira/browse/SOLR-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson closed SOLR-7764. Resolution: Invalid This is a Tika issue, not a Solr one; please continue the discussion with the Tika project. > Solr indexing hangs if it encounters a certain XML parse error > > > Key: SOLR-7764 > URL: https://issues.apache.org/jira/browse/SOLR-7764 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.2 > Environment: Ubuntu 12.04.5 LTS >Reporter: Sorin Gheorghiu > Labels: bluespice, indexing > Attachments: Solr_XML_parse_error_080715.txt > > > BlueSpice (http://bluespice.com/) uses Solr to index documents for the > 'Extended search' feature. > Solr hangs if a certain error occurs during indexing: > 8.7.2015 15:34:26 > ERROR > SolrCore > org.apache.solr.common.SolrException: > org.apache.tika.exception.TikaException: XML parse error > 8.7.2015 15:34:26 > ERROR > SolrDispatchFilter > null:org.apache.solr.common.SolrException: > org.apache.tika.exception.TikaException: XML parse error
[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880593#comment-15880593 ] Uwe Schindler commented on SOLR-9450: - I updated the Jenkins jobs for artifacts and added: {noformat} solr.javadoc.url=${JENKINS_URL}/job/${JOB_NAME}/javadoc/ {noformat} > Link to online Javadocs instead of distributing with binary download > > > Key: SOLR-9450 > URL: https://issues.apache.org/jira/browse/SOLR-9450 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Assignee: Uwe Schindler > Fix For: 6.5, master (7.0) > > Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, > SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch > > > Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} > in the binary download with a link to the online JavaDocs. The build should > make sure to generate a link to the correct version. I believe this is the > correct template: http://lucene.apache.org/solr/6_2_0/
[jira] [Commented] (SOLR-10197) SolrException during indexing
[ https://issues.apache.org/jira/browse/SOLR-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880590#comment-15880590 ] Alexandre Rafalovitch commented on SOLR-10197: -- Please note that these kinds of questions are best asked on the Solr Users mailing list, as they are usually a configuration issue; JIRA is used to report bugs in Lucene/Solr itself. Also, it is important to provide the Solr version. However, from a quick look at the exception, it seems that you have a MoreLikeThis component activated with a numeric field configured as part of its similarity field list. When the search term is textual (not numeric) and Solr tries to expand the query against the numeric field, this causes an exception. I would check the specific query issued to Solr, look at the definition of the request handler it is issued against (in solrconfig.xml), and check the MLT field list configuration and the types of those fields. > SolrException during indexing > - > > Key: SOLR-10197 > URL: https://issues.apache.org/jira/browse/SOLR-10197 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Server >Affects Versions: 4.5 > Environment: Ubuntu 14.04.5 LTS >Reporter: Sorin Gheorghiu > Labels: bluespice, indexing > Attachments: BS_Solr_error_invalid_no.txt > > > BlueSpice (http://bluespice.com/) uses Solr to index documents for the > 'Extended search' feature. Solr hangs consistently during indexing and an > error occurs (see attached). > In the ExtendedSearch.log there is no error, but the latest indexed > document/wiki page: > 22.02.2017 17:45:11 > Zu indexierende Artikel: 4205 > 1: Indexiere Wiki Seiten: 1% - WUI netz.xls > 2: Indexiere Wiki Seiten: 1% - IndividArbanw.pdf > ... 
> 3526: Indexiere Wiki Seiten: 84% - 2007 > 3527: Indexiere Wiki Seiten: 84% - Buchdurchlaufzeit > 3528: Indexiere Wiki Seiten: 84% - Mahnroutinen > 3529: Indexiere Wiki Seiten: 84% - Software für Informationskompetenz > Could you provide any indication of the error?
[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
[ https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-7707: Attachment: LUCENE-7707.patch here is a patch > Only assign ScoreDoc#shardIndex if it was already assigned to non default > (-1) value > > > Key: LUCENE-7707 > URL: https://issues.apache.org/jira/browse/LUCENE-7707 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Simon Willnauer > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7707.patch > > > When you use TopDocs.merge today, it always overrides the ScoreDoc#shardIndex > value. The assumption made here is that all shard results are merged > at once, which is not necessarily the case. If, for instance, incremental merge > phases are applied, the shard index doesn't correspond to the index in the > outer TopDocs array. To make this a backwards-compatible yet > non-controversial change, we could change the internals of TopDocs#merge to > assign this value only if it has not already been set to a non-default > value, allowing multiple or sparse top-docs merging.
[jira] [Created] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value
Simon Willnauer created LUCENE-7707: --- Summary: Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value Key: LUCENE-7707 URL: https://issues.apache.org/jira/browse/LUCENE-7707 Project: Lucene - Core Issue Type: Improvement Reporter: Simon Willnauer Fix For: master (7.0), 6.5.0 When you use TopDocs.merge today, it always overrides the ScoreDoc#shardIndex value. The assumption made here is that all shard results are merged at once, which is not necessarily the case. If, for instance, incremental merge phases are applied, the shard index doesn't correspond to the index in the outer TopDocs array. To make this a backwards-compatible yet non-controversial change, we could change the internals of TopDocs#merge to assign this value only if it has not already been set to a non-default value, allowing multiple or sparse top-docs merging.
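The behavior proposed in the issue description can be sketched as follows. This is a minimal stand-alone illustration with simplified stand-in classes, not the actual Lucene patch: during a merge phase, `shardIndex` is assigned only while it still holds the `-1` default, so a value set by an earlier (inner) merge phase survives later (outer) merges.

```java
// Simplified stand-in for Lucene's ScoreDoc; -1 marks "shardIndex not assigned yet".
class ScoreDoc {
    int doc;
    float score;
    int shardIndex = -1;
    ScoreDoc(int doc, float score) { this.doc = doc; this.score = score; }
}

public class ShardIndexMerge {
    // Assign the merging shard's index only to hits whose shardIndex is still
    // the -1 default; hits tagged by a previous merge phase keep their value.
    static void assignShardIndex(ScoreDoc[] shardHits, int shardIdx) {
        for (ScoreDoc hit : shardHits) {
            if (hit.shardIndex == -1) {
                hit.shardIndex = shardIdx;
            }
        }
    }

    public static void main(String[] args) {
        ScoreDoc fresh = new ScoreDoc(1, 2.0f);
        ScoreDoc preAssigned = new ScoreDoc(2, 1.5f);
        preAssigned.shardIndex = 7; // set by an earlier, inner merge phase
        assignShardIndex(new ScoreDoc[] { fresh, preAssigned }, 3);
        System.out.println(fresh.shardIndex + " " + preAssigned.shardIndex); // prints: 3 7
    }
}
```

The check makes the change backwards compatible: callers that merge everything at once start from all-`-1` values and see the old behavior, while incremental mergers can pre-assign indices.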
[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880568#comment-15880568 ] ASF subversion and git services commented on SOLR-9450: --- Commit 9ecc1ec79db7ed2b7f8f7bb4ce6cf93d2ce3c382 in lucene-solr's branch refs/heads/branch_6x from [~thetaphi] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9ecc1ec ] SOLR-9450: The docs/ folder in the binary distribution now contains a single index.html file linking to the online documentation, reducing the size of the download
[jira] [Resolved] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-9450. - Resolution: Fixed
[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880566#comment-15880566 ] ASF subversion and git services commented on SOLR-9450: --- Commit 894a43b259a72a82f07649b0d93ab3c17c4d89c4 in lucene-solr's branch refs/heads/master from [~thetaphi] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=894a43b ] SOLR-9450: The docs/ folder in the binary distribution now contains a single index.html file linking to the online documentation, reducing the size of the download
[jira] [Commented] (SOLR-10076) Hiding keystore and truststore passwords from /admin/info/* outputs
[ https://issues.apache.org/jira/browse/SOLR-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880512#comment-15880512 ] Mano Kovacs commented on SOLR-10076: Thank you for the feedback, [~ichattopadhyaya]. Do you think the redaction of the command-line password could be handled the way the first patch does it? > Hiding keystore and truststore passwords from /admin/info/* outputs > --- > > Key: SOLR-10076 > URL: https://issues.apache.org/jira/browse/SOLR-10076 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mano Kovacs > Attachments: SOLR-10076.patch > > > Keystore and truststore passwords are passed as system properties via > command-line parameters. > As a result, {{/admin/info/properties}} and {{/admin/info/system}} will print > out the received passwords. > Proposed solution: automatically redact the value of any system property > whose name contains the word {{password}}, replacing its value with > {{**}}.
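The redaction idea proposed in the issue can be sketched in a few lines. This is a hedged illustration, not Solr's actual patch: any system property whose name contains "password" (case-insensitive) has its value replaced with {{**}} before the /admin/info/* output is rendered.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Sketch of the proposed redaction (not Solr's actual implementation):
// walk the property map and mask values whose key mentions "password".
public class PropertyRedactor {
    static final String REDACTED = "**";

    static Map<String, String> redact(Map<String, String> props) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            boolean secret = e.getKey().toLowerCase(Locale.ROOT).contains("password");
            out.put(e.getKey(), secret ? REDACTED : e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("javax.net.ssl.keyStorePassword", "changeit");
        props.put("solr.log.dir", "/var/log/solr");
        System.out.println(redact(props)); // keystore password masked, log dir untouched
    }
}
```

Matching on the key name rather than on a fixed property list is what makes the approach cover both keystore and truststore passwords (and any future ones) without per-property configuration.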
[jira] [Commented] (SOLR-9527) Solr RESTORE api doesn't distribute the replicas uniformly
[ https://issues.apache.org/jira/browse/SOLR-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880507#comment-15880507 ] Dewald Viljoen commented on SOLR-9527: -- I've run smack dab into this issue recently and was wondering what the progress is on this patch. I'm currently running Solr 6.4.1 and would really like to take advantage of the Collections Backup/Restore functionality in combination with HDFS. All works well until I restore the collection and all my shards end up on one of my SolrCloud nodes. I can specify a replicationFactor of 2 and then, through some other API calls, make the replicas the leaders and rebalance everything, but it's a bit of a mess. I'm happy to lend my efforts to get this issue resolved. > Solr RESTORE api doesn't distribute the replicas uniformly > --- > > Key: SOLR-9527 > URL: https://issues.apache.org/jira/browse/SOLR-9527 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 6.1 >Reporter: Hrishikesh Gadre > Attachments: SOLR-9527.patch, SOLR-9527.patch, SOLR-9527.patch > > > Please refer to this email thread for details: > http://lucene.markmail.org/message/ycun4x5nx7lwj5sk?q=solr+list:org%2Eapache%2Elucene%2Esolr-user+order:date-backward=1
[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880505#comment-15880505 ] Uwe Schindler commented on SOLR-9450: - So I think I will start by committing the current patch, and we can work on improving the documentation.
[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880503#comment-15880503 ] Uwe Schindler commented on SOLR-9450: - Ah OK, cool.
[jira] [Resolved] (LUCENE-7419) performance bug in tokenstream.end()
[ https://issues.apache.org/jira/browse/LUCENE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-7419. Resolution: Fixed OK, I backported to the 5.5.x branch, so the next release we do from it will have it. > performance bug in tokenstream.end() > > > Key: LUCENE-7419 > URL: https://issues.apache.org/jira/browse/LUCENE-7419 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir >Priority: Blocker > Fix For: master (7.0), 5.5.5, 6.2 > > Attachments: LUCENE-7419.patch > > > TokenStream.end() calls getAttribute(), which is pretty costly to do > per-stream. > It does its current hack because doing this in the ctor of TokenStream is "too early". > Instead, we can just add a variant of clear(), called end(), to AttributeImpl. > For most attributes it defers to clear(), but for PosIncAtt it can handle the > special case.
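The fix direction described in the issue can be sketched with simplified stand-in classes (this is an illustration of the idea, not Lucene's actual code): each attribute implementation gains an `end()` hook that defaults to `clear()`, and only the position-increment attribute overrides it, so `TokenStream.end()` no longer needs a per-stream `getAttribute()` lookup.

```java
// Simplified stand-ins for Lucene's attribute classes, to show the shape of
// the change: end() is a new hook on the attribute, defaulting to clear().
abstract class AttributeImpl {
    public abstract void clear();
    public void end() { clear(); } // default: end-of-stream behaves like clear()
}

class PositionIncrementAttributeImpl extends AttributeImpl {
    int positionIncrement = 1;
    @Override public void clear() { positionIncrement = 1; }
    @Override public void end() { positionIncrement = 0; } // the special case
}

public class EndHookDemo {
    public static void main(String[] args) {
        PositionIncrementAttributeImpl att = new PositionIncrementAttributeImpl();
        att.end(); // no attribute lookup needed; the override handles the special case
        System.out.println(att.positionIncrement); // prints: 0
    }
}
```

With the hook in place, the stream can simply call `end()` on every attribute it holds, instead of looking up the one attribute that needs special end-of-stream handling.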
[jira] [Commented] (LUCENE-7419) performance bug in tokenstream.end()
[ https://issues.apache.org/jira/browse/LUCENE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880500#comment-15880500 ] ASF subversion and git services commented on LUCENE-7419: - Commit 4dbaed52a0a721b2b9668ee8074da42585fd54ea in lucene-solr's branch refs/heads/branch_5_5 from [~rcmuir] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4dbaed5 ] LUCENE-7419: Don't lookup PositionIncrementAttribute every time in TokenStream.end()
[jira] [Commented] (LUCENE-7704) SysnonymGraphFilter doesn't respect ignoreCase parameter
[ https://issues.apache.org/jira/browse/LUCENE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880486#comment-15880486 ] Sebastian Yonekura Baeza commented on LUCENE-7704: -- Oh, sorry I missed those docs, given that it was a deprecated class I didn't pay much attention to it. Indeed, without the javadocs the parameter {{ignoreCase}} was kind of misleading. Thank you [~mikemccand] for the clarification! > SysnonymGraphFilter doesn't respect ignoreCase parameter > > > Key: LUCENE-7704 > URL: https://issues.apache.org/jira/browse/LUCENE-7704 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis >Affects Versions: 6.4.1 >Reporter: Sebastian Yonekura Baeza >Priority: Minor > Fix For: master (7.0), 6.5.0 > > Attachments: LUCENE-7704.patch > > > Hi, it seems that SynonymGraphFilter doesn't respect ignoreCase parameter. In > particular this test doesn't pass: > {code:title=UppercaseSynonymMapTest.java|borderStyle=solid} > package com.mapcity.suggest.lucene; > import org.apache.lucene.analysis.Analyzer; > import org.apache.lucene.analysis.TokenStream; > import org.apache.lucene.analysis.Tokenizer; > import org.apache.lucene.analysis.core.WhitespaceTokenizer; > import org.apache.lucene.analysis.synonym.SynonymGraphFilter; > import org.apache.lucene.analysis.synonym.SynonymMap; > import org.apache.lucene.util.CharsRef; > import org.apache.lucene.util.CharsRefBuilder; > import org.junit.Test; > import java.io.IOException; > import static > org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents; > /** > * @author Sebastian Yonekura > * Created on 22-02-17 > */ > public class UppercaseSynonymMapTest { > @Test > public void analyzerTest01() throws IOException { > // This passes > testAssertMapping("word", "synonym"); > // this one not > testAssertMapping("word".toUpperCase(), "synonym"); > } > private void testAssertMapping(String inputString, String outputString) > throws IOException { > SynonymMap.Builder 
builder = new SynonymMap.Builder(false); > CharsRef input = SynonymMap.Builder.join(inputString.split(" "), new > CharsRefBuilder()); > CharsRef output = SynonymMap.Builder.join(outputString.split(" "), > new CharsRefBuilder()); > builder.add(input, output, true); > Analyzer analyzer = new CustomAnalyzer(builder.build()); > TokenStream tokenStream = analyzer.tokenStream("field", inputString); > assertTokenStreamContents(tokenStream, new String[]{ > outputString, inputString > }); > } > static class CustomAnalyzer extends Analyzer { > private SynonymMap synonymMap; > CustomAnalyzer(SynonymMap synonymMap) { > this.synonymMap = synonymMap; > } > @Override > protected TokenStreamComponents createComponents(String s) { > Tokenizer tokenizer = new WhitespaceTokenizer(); > TokenStream tokenStream = new SynonymGraphFilter(tokenizer, > synonymMap, true); // Ignore case True > return new TokenStreamComponents(tokenizer, tokenStream); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download
[ https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880477#comment-15880477 ] Steve Rowe commented on SOLR-9450:
--
{quote}
bq. Update website quickstart.mdtext to suggest indexing something else than local javadocs
There is also a quickstart.mdtext in the checkout's site directory. We should change it there, too.
{quote}
FYI, there is an Ant target to convert the distribution doc version into the website version - see the last bullet in item #1 here: [https://wiki.apache.org/lucene-java/ReleaseTodo#Update_the_rest_of_the_website].

> Link to online Javadocs instead of distributing with binary download
> ---
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
> Issue Type: Sub-task
> Components: Build
> Reporter: Jan Høydahl
> Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
> Spinoff from SOLR-6806. This sub-task will replace the contents of {{docs}} in the binary download with a link to the online Javadocs. The build should make sure to generate a link to the correct version. I believe this is the correct template: http://lucene.apache.org/solr/6_2_0/

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
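The versioned-link generation the description asks the build to do is mechanical: dots in the release version become underscores in the URL path. A minimal sketch, assuming the URL template quoted in the issue description (the method name is hypothetical; the real work would live in the Ant build, not Java code):

```java
// Sketch: derive the online-javadocs URL for a Solr release, following
// the template given in SOLR-9450's description (hypothetical helper).
public class JavadocsLink {

    static String onlineDocsUrl(String version) {
        // "6.2.0" -> "http://lucene.apache.org/solr/6_2_0/"
        return "http://lucene.apache.org/solr/" + version.replace('.', '_') + "/";
    }

    public static void main(String[] args) {
        System.out.println(onlineDocsUrl("6.2.0"));
    }
}
```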
[jira] [Resolved] (LUCENE-7704) SysnonymGraphFilter doesn't respect ignoreCase parameter
[ https://issues.apache.org/jira/browse/LUCENE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-7704.
--
    Resolution: Fixed
    Fix Version/s: 6.5.0
                   master (7.0)

> SysnonymGraphFilter doesn't respect ignoreCase parameter
> ---
>
> Key: LUCENE-7704
> URL: https://issues.apache.org/jira/browse/LUCENE-7704
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Affects Versions: 6.4.1
> Reporter: Sebastian Yonekura Baeza
> Priority: Minor
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7704.patch
>
> Hi, it seems that SynonymGraphFilter doesn't respect the ignoreCase parameter. In particular, this test doesn't pass:
> {code:title=UppercaseSynonymMapTest.java|borderStyle=solid}
> package com.mapcity.suggest.lucene;
>
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.analysis.Tokenizer;
> import org.apache.lucene.analysis.core.WhitespaceTokenizer;
> import org.apache.lucene.analysis.synonym.SynonymGraphFilter;
> import org.apache.lucene.analysis.synonym.SynonymMap;
> import org.apache.lucene.util.CharsRef;
> import org.apache.lucene.util.CharsRefBuilder;
> import org.junit.Test;
>
> import java.io.IOException;
>
> import static org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents;
>
> /**
>  * @author Sebastian Yonekura
>  * Created on 22-02-17
>  */
> public class UppercaseSynonymMapTest {
>
>     @Test
>     public void analyzerTest01() throws IOException {
>         // This passes
>         testAssertMapping("word", "synonym");
>         // This one does not
>         testAssertMapping("word".toUpperCase(), "synonym");
>     }
>
>     private void testAssertMapping(String inputString, String outputString) throws IOException {
>         SynonymMap.Builder builder = new SynonymMap.Builder(false);
>         CharsRef input = SynonymMap.Builder.join(inputString.split(" "), new CharsRefBuilder());
>         CharsRef output = SynonymMap.Builder.join(outputString.split(" "), new CharsRefBuilder());
>         builder.add(input, output, true);
>         Analyzer analyzer = new CustomAnalyzer(builder.build());
>         TokenStream tokenStream = analyzer.tokenStream("field", inputString);
>         assertTokenStreamContents(tokenStream, new String[]{
>                 outputString, inputString
>         });
>     }
>
>     static class CustomAnalyzer extends Analyzer {
>         private SynonymMap synonymMap;
>
>         CustomAnalyzer(SynonymMap synonymMap) {
>             this.synonymMap = synonymMap;
>         }
>
>         @Override
>         protected TokenStreamComponents createComponents(String s) {
>             Tokenizer tokenizer = new WhitespaceTokenizer();
>             TokenStream tokenStream = new SynonymGraphFilter(tokenizer, synonymMap, true); // ignoreCase = true
>             return new TokenStreamComponents(tokenizer, tokenStream);
>         }
>     }
> }
> {code}

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org