[jira] [Commented] (SOLR-9530) Add an Atomic Update Processor

2017-02-23 Thread Hamso (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882167#comment-15882167
 ] 

Hamso commented on SOLR-9530:
-

I have a question.
What if we have:

doc1 
{noformat}
{id: 1, Street: xyz}
{noformat}

and doc2
{noformat}
{id: 1, firstName: Tom, lastName: Cruise}
{id: 1, firstName: Max, lastName: Mueller}
{noformat}

What will the final doc look like?

> Add an Atomic Update Processor 
> ---
>
> Key: SOLR-9530
> URL: https://issues.apache.org/jira/browse/SOLR-9530
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
> Attachments: SOLR-9530.patch, SOLR-9530.patch, SOLR-9530.patch
>
>
> I'd like to explore the idea of adding a new update processor to help ingest 
> partial updates.
> Example use-case - There are two datasets with a common id field. How can I 
> merge both of them at index time?
> Proposed Solution: 
> {code}
> <updateRequestProcessorChain name="atomic">
>   <processor class="solr.AtomicUpdateProcessorFactory">
>     <str name="fieldName">add</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory"/>
>   <processor class="solr.RunUpdateProcessorFactory"/>
> </updateRequestProcessorChain>
> {code}
> So the first JSON dump could be ingested against 
> {{http://localhost:8983/solr/gettingstarted/update/json}}
> And then the second JSON could be ingested against
> {{http://localhost:8983/solr/gettingstarted/update/json?processor=atomic}}
> The Atomic Update Processor could support all the atomic update operations 
> currently supported.
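For reference, a minimal SolrJ sketch of the kind of atomic "add" operation such 
a processor would generate per incoming field (the field names are illustrative 
assumptions, not code from the attached patches):

{code}
import java.util.Collections;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AtomicAddExample {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/gettingstarted").build()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "1");
      // A map of operation -> value is SolrJ's classic atomic-update form.
      doc.addField("firstName", Collections.singletonMap("add", "Tom"));
      client.add(doc);
      client.commit();
    }
  }
}
{code}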






[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882136#comment-15882136
 ] 

Simon Willnauer commented on LUCENE-7707:
-

[~jpountz] are your concerns addressed with the latest patch?

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, 
> LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at once, 
> which is not necessarily the case. If, for instance, incremental merge phases 
> are applied, the shard index doesn't correspond to the index in the outer 
> TopDocs array. To make this a backwards-compatible yet non-controversial 
> change, we could change the internals of TopDocs#merge to assign this value 
> only if it has not already been set to a non-default (-1) value, allowing 
> multiple or sparse top-docs merging.
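To make the incremental-merge scenario concrete, a sketch with assumed 
per-shard inputs (not code from the attached patches):

{code}
import org.apache.lucene.search.TopDocs;

public class IncrementalMergeSketch {
  // s0..s3: top hits collected separately from four shards (assumed inputs).
  static TopDocs mergeInTwoPhases(TopDocs s0, TopDocs s1, TopDocs s2, TopDocs s3) {
    // Each inner merge assigns ScoreDoc#shardIndex relative to its own array...
    TopDocs first  = TopDocs.merge(10, new TopDocs[] { s0, s1 });
    TopDocs second = TopDocs.merge(10, new TopDocs[] { s2, s3 });
    // ...so when this outer merge overwrites shardIndex again, the values no
    // longer identify the original shards; that is the behavior this issue changes.
    return TopDocs.merge(10, new TopDocs[] { first, second });
  }
}
{code}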






[jira] [Commented] (SOLR-9530) Add an Atomic Update Processor

2017-02-23 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882079#comment-15882079
 ] 

Noble Paul commented on SOLR-9530:
--

I'm not sure if the second request must fail.

Imagine a doc already exists with:
{code}
{id: 1}
{code}

Subsequently, the user sends two parallel requests:

{code}
{id:1, firstName: Tom}
{id:1, lastName: Cruise}
{code}
 
After these two operations are performed, the final doc should be:

{code}
{id:1, firstName:Tom, lastName:Cruise}
{code}

The system should handle race conditions gracefully.

The URP can fetch the {{\_version_}} before sending the appropriate atomic 
operation, using optimistic concurrency. If the request fails, it can reload 
the {{\_version_}} and retry.
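A minimal SolrJ sketch of that retry loop (assuming a plain SolrClient and 
illustrative field names; not the actual URP code):

{code}
import java.util.Collections;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

public class OptimisticAtomicAdd {
  // Retry an atomic "add" until the _version_ we read still matches on write.
  static void addWithRetry(SolrClient client, String id, String field, Object val)
      throws Exception {
    while (true) {
      long version = (Long) client.getById(id).getFieldValue("_version_");
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", id);
      doc.addField("_version_", version); // optimistic concurrency check
      doc.addField(field, Collections.singletonMap("add", val));
      try {
        client.add(doc);
        return;                        // success: no concurrent update happened
      } catch (SolrException e) {
        if (e.code() != 409) throw e;  // 409 = version conflict: reload and retry
      }
    }
  }
}
{code}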

> Add an Atomic Update Processor 
> ---
>
> Key: SOLR-9530
> URL: https://issues.apache.org/jira/browse/SOLR-9530
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
> Attachments: SOLR-9530.patch, SOLR-9530.patch, SOLR-9530.patch
>
>
> I'd like to explore the idea of adding a new update processor to help ingest 
> partial updates.
> Example use-case - There are two datasets with a common id field. How can I 
> merge both of them at index time?
> Proposed Solution: 
> {code}
> <updateRequestProcessorChain name="atomic">
>   <processor class="solr.AtomicUpdateProcessorFactory">
>     <str name="fieldName">add</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory"/>
>   <processor class="solr.RunUpdateProcessorFactory"/>
> </updateRequestProcessorChain>
> {code}
> So the first JSON dump could be ingested against 
> {{http://localhost:8983/solr/gettingstarted/update/json}}
> And then the second JSON could be ingested against
> {{http://localhost:8983/solr/gettingstarted/update/json?processor=atomic}}
> The Atomic Update Processor could support all the atomic update operations 
> currently supported.






[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Varun Rajput (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882054#comment-15882054
 ] 

Varun Rajput commented on SOLR-6736:


I like the approach of "trusted" configsets. How would the restrictions on 
vulnerable components be imposed?

> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881970#comment-15881970
 ] 

Ishan Chattopadhyaya edited comment on SOLR-6736 at 2/24/17 5:08 AM:
-

bq. So this would not affect the setups which have security enabled. 
Right.
bq. If this endpoint is secured using authorization and authentication, then we 
can store the uploaded configsets with "trusted=true".
Those "trusted" configsets can be used to create collections without any 
restrictions.



was (Author: ichattopadhyaya):
Right.
bq. If this endpoint is secured using authorization and authentication, then we 
can store the uploaded configsets with "trusted=true".
Those "trusted" configsets can be used to create collections without any 
restrictions.


> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881970#comment-15881970
 ] 

Ishan Chattopadhyaya commented on SOLR-6736:


Right.
bq. If this endpoint is secured using authorization and authentication, then we 
can store the uploaded configsets with "trusted=true".
Those "trusted" configsets can be used to create collections without any 
restrictions.


> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Hrishikesh Gadre (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881959#comment-15881959
 ] 

Hrishikesh Gadre commented on SOLR-6736:


[~ichattopadhyaya]

bq. This seems like a sound approach in theory, but oftentimes users don't 
follow proper procedures for deployment and end up exposing their deployments 
without proper authentication/authorization. This extra security is to save 
such users from potential remote-code-execution attacks. Our guidelines 
should, anyway, be for admins to enable security before going to production.

Oh, I think I misunderstood your earlier statement. So this would not affect the 
setups which have security enabled.

> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Commented] (SOLR-8440) Script support for enabling basic auth

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881944#comment-15881944
 ] 

Ishan Chattopadhyaya commented on SOLR-8440:


Agreed.

> Script support for enabling basic auth
> --
>
> Key: SOLR-8440
> URL: https://issues.apache.org/jira/browse/SOLR-8440
> Project: Solr
>  Issue Type: New Feature
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Ishan Chattopadhyaya
>  Labels: authentication, security
>
> Now that BasicAuthPlugin will be able to work without an AuthorizationPlugin 
> (SOLR-8429), it would be sweet to provide a super simple way to "Password 
> protect Solr"™ right from the command line:
> {noformat}
> bin/solr basicAuth -adduser -user solr -pass SolrRocks
> {noformat}
> It would take the mystery out of enabling one single password across the 
> board. The command would do something like this:
> # Check if HTTPS is enabled, and if not, print a friendly warning
> # Check if {{/security.json}} already exists
> ## NO => create one with only plugin class defined
> ## YES => Abort if exists but plugin is not {{BasicAuthPlugin}}
> # Using security REST API, add the new user






[jira] [Commented] (SOLR-8440) Script support for enabling basic auth

2017-02-23 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881936#comment-15881936
 ] 

Noble Paul commented on SOLR-8440:
--

I guess we must enable {{RulebasedAuthorization}} as well, and ensure that 
{{collection-admin-edit}}, {{core-admin-edit}}, and {{security-edit}} are protected.

> Script support for enabling basic auth
> --
>
> Key: SOLR-8440
> URL: https://issues.apache.org/jira/browse/SOLR-8440
> Project: Solr
>  Issue Type: New Feature
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Ishan Chattopadhyaya
>  Labels: authentication, security
>
> Now that BasicAuthPlugin will be able to work without an AuthorizationPlugin 
> (SOLR-8429), it would be sweet to provide a super simple way to "Password 
> protect Solr"™ right from the command line:
> {noformat}
> bin/solr basicAuth -adduser -user solr -pass SolrRocks
> {noformat}
> It would take the mystery out of enabling one single password across the 
> board. The command would do something like this:
> # Check if HTTPS is enabled, and if not, print a friendly warning
> # Check if {{/security.json}} already exists
> ## NO => create one with only plugin class defined
> ## YES => Abort if exists but plugin is not {{BasicAuthPlugin}}
> # Using security REST API, add the new user






[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881915#comment-15881915
 ] 

Ishan Chattopadhyaya edited comment on SOLR-6736 at 2/24/17 4:35 AM:
-

{quote}
If it is to simplify the development process, then that can be mitigated by 
setting up a unsecure Solr cluster in the staging environment. In case of a 
production environment all endpoints must be authenticated using the configured 
mechanism if security is enabled.
{quote}
This seems like a sound approach in theory, but oftentimes users don't follow 
proper procedures for deployment and end up exposing their deployments without 
proper authentication/authorization. This extra security is to save such users 
from potential remote-code-execution attacks. Our guidelines should, 
anyway, be for admins to enable security before going to production.

Having this feature disabled out of the box was the other alternative that was 
explored above (to protect users who might end up exposing their cluster 
without securing it first), but I think it is inconvenient and can (and should) 
be avoided.


was (Author: ichattopadhyaya):
This seems like a sound approach in theory, but oftentimes users don't follow 
proper procedures for deployment and end up exposing their deployments without 
proper authentication/authorization. This extra security is to save such users 
from potential remote-code-execution attacks. Our guidelines should, 
anyway, be for admins to enable security before going to production.

Having this feature disabled out of the box was the other alternative that was 
explored above (to protect users who might end up exposing their cluster 
without securing it first), but I think it is inconvenient and can (and should) 
be avoided.

> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Commented] (SOLR-8440) Script support for enabling basic auth

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881920#comment-15881920
 ] 

Ishan Chattopadhyaya commented on SOLR-8440:


I'm planning to work on this soon.

> Script support for enabling basic auth
> --
>
> Key: SOLR-8440
> URL: https://issues.apache.org/jira/browse/SOLR-8440
> Project: Solr
>  Issue Type: New Feature
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>  Labels: authentication, security
>
> Now that BasicAuthPlugin will be able to work without an AuthorizationPlugin 
> (SOLR-8429), it would be sweet to provide a super simple way to "Password 
> protect Solr"™ right from the command line:
> {noformat}
> bin/solr basicAuth -adduser -user solr -pass SolrRocks
> {noformat}
> It would take the mystery out of enabling one single password across the 
> board. The command would do something like this:
> # Check if HTTPS is enabled, and if not, print a friendly warning
> # Check if {{/security.json}} already exists
> ## NO => create one with only plugin class defined
> ## YES => Abort if exists but plugin is not {{BasicAuthPlugin}}
> # Using security REST API, add the new user






[jira] [Assigned] (SOLR-8440) Script support for enabling basic auth

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-8440:
--

Assignee: Ishan Chattopadhyaya  (was: Jan Høydahl)

> Script support for enabling basic auth
> --
>
> Key: SOLR-8440
> URL: https://issues.apache.org/jira/browse/SOLR-8440
> Project: Solr
>  Issue Type: New Feature
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Ishan Chattopadhyaya
>  Labels: authentication, security
>
> Now that BasicAuthPlugin will be able to work without an AuthorizationPlugin 
> (SOLR-8429), it would be sweet to provide a super simple way to "Password 
> protect Solr"™ right from the command line:
> {noformat}
> bin/solr basicAuth -adduser -user solr -pass SolrRocks
> {noformat}
> It would take the mystery out of enabling one single password across the 
> board. The command would do something like this:
> # Check if HTTPS is enabled, and if not, print a friendly warning
> # Check if {{/security.json}} already exists
> ## NO => create one with only plugin class defined
> ## YES => Abort if exists but plugin is not {{BasicAuthPlugin}}
> # Using security REST API, add the new user






[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881919#comment-15881919
 ] 

Noble Paul commented on SOLR-6736:
--

Don't disable {{DataImportHandler}}; just disable {{ScriptTransformer}}.

> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881915#comment-15881915
 ] 

Ishan Chattopadhyaya commented on SOLR-6736:


This seems like a sound approach in theory, but oftentimes users don't follow 
proper procedures for deployment and end up exposing their deployments without 
proper authentication/authorization. This extra security is to save such users 
from potential remote-code-execution attacks. Our guidelines should, 
anyway, be for admins to enable security before going to production.

Having this feature disabled out of the box was the other alternative that was 
explored above (to protect users who might end up exposing their cluster 
without securing it first), but I think it is inconvenient and can (and should) 
be avoided.

> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Hrishikesh Gadre (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881903#comment-15881903
 ] 

Hrishikesh Gadre edited comment on SOLR-6736 at 2/24/17 4:10 AM:
-

[~ichattopadhyaya]

bq. We can allow unauthenticated/unauthorized users to upload a configset,

I am not following why this endpoint needs to be "unsecure"? If it is to 
simplify the development process, then that can be mitigated by setting up an 
unsecure Solr cluster in the staging environment. In case of a production 
environment, all endpoints must be authenticated using the configured mechanism 
if security is enabled. This request handler should also implement the 
PermissionNameProvider interface so that only users who have 
"CONFIG_EDIT_PERM" can update it.




was (Author: hgadre):
[~ichattopadhyaya]

bq. We can allow unauthenticated/unauthorized users to upload a configset,

I am not following why this endpoint needs to be "unsecure"? If it is to 
simplify the development process, then that can be mitigated by setting up an 
unsecure Solr cluster in the staging environment. In case of a production 
environment, all endpoints must be authenticated using the mechanism 
configured in the security.json. This request handler should also implement the 
PermissionNameProvider interface so that only users who have 
"CONFIG_EDIT_PERM" can update it.



> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Hrishikesh Gadre (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881903#comment-15881903
 ] 

Hrishikesh Gadre edited comment on SOLR-6736 at 2/24/17 4:09 AM:
-

[~ichattopadhyaya]

bq. We can allow unauthenticated/unauthorized users to upload a configset,

I am not following why this endpoint needs to be "unsecure"? If it is to 
simplify the development process, then that can be mitigated by setting up an 
unsecure Solr cluster in the staging environment. In case of a production 
environment, all endpoints must be authenticated using the mechanism 
configured in the security.json. This request handler should also implement the 
PermissionNameProvider interface so that only users who have 
"CONFIG_EDIT_PERM" can update it.




was (Author: hgadre):
[~ichattopadhyaya]

bq. We can allow unauthenticated/unauthorized users to upload a configset,

I am not following why this endpoint needs to be "unsecure"? If it is to 
simplify the development process, then that can be mitigated by setting up a 
dev Solr cluster in the staging environment. In case of a production 
environment, all endpoints must be authenticated using the mechanism 
configured in the security.json. This request handler should also implement the 
PermissionNameProvider interface so that only users who have 
"CONFIG_EDIT_PERM" can update it.



> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Hrishikesh Gadre (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881903#comment-15881903
 ] 

Hrishikesh Gadre commented on SOLR-6736:


[~ichattopadhyaya]

bq. We can allow unauthenticated/unauthorized users to upload a configset,

I am not following why this endpoint needs to be "unsecure"? If it is to 
simplify the development process, then that can be mitigated by setting up a 
dev Solr cluster in the staging environment. In case of a production 
environment, all endpoints must be authenticated using the mechanism 
configured in the security.json. This request handler should also implement the 
PermissionNameProvider interface so that only users who have 
"CONFIG_EDIT_PERM" can update it.



> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Comment Edited] (SOLR-6203) cast exception while searching with sort function and result grouping

2017-02-23 Thread Judith Silverman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881896#comment-15881896
 ] 

Judith Silverman edited comment on SOLR-6203 at 2/24/17 4:02 AM:
-

Hi, Christine, are there changes you would like me to make to the patch dated 
05Dec16? As I recall, your commit of 23Dec16 took us part of the way toward 
that patch. 
Do my concerns about SOLR-9660 (02Dec16 above) make sense?
Thanks,
Judith


was (Author: judith):
Hi, Christine, are there changes you would like me to make in the patch dated 
05Dec16? As I recall, your commit of 23Dec16 took us part of the way toward 
that patch. 
Do my concerns about SOLR-9660 (02Dec16 above) make sense?
Thanks,
Judith

> cast exception while searching with sort function and result grouping
> -
>
> Key: SOLR-6203
> URL: https://issues.apache.org/jira/browse/SOLR-6203
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 4.7, 4.8
>Reporter: Nate Dire
>Assignee: Christine Poerschke
> Attachments: README, SOLR-6203.patch, SOLR-6203.patch, 
> SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, 
> SOLR-6203-unittest.patch, SOLR-6203-unittest.patch
>
>
> After upgrading from 4.5.1 to 4.7+, a schema including a {{"*"}} dynamic 
> field as text gets a cast exception when using a sort function and result 
> grouping.  
> Repro (with example config):
> # Add {{"*"}} dynamic field as a {{TextField}}, eg:
> {noformat}
> <dynamicField name="*" type="text_general"/>
> {noformat}
> # Create a sharded collection
> {noformat}
> curl 
> 'http://localhost:8983/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=2'
> {noformat}
> # Add example docs (query must have some results)
> # Submit query which sorts on a function result and uses result grouping:
> {noformat}
> {
>   "responseHeader": {
> "status": 500,
> "QTime": 50,
> "params": {
>   "sort": "sqrt(popularity) desc",
>   "indent": "true",
>   "q": "*:*",
>   "_": "1403709010008",
>   "group.field": "manu",
>   "group": "true",
>   "wt": "json"
> }
>   },
>   "error": {
> "msg": "java.lang.Double cannot be cast to 
> org.apache.lucene.util.BytesRef",
> "code": 500
>   }
> }
> {noformat}
> Source exception from log:
> {noformat}
> ERROR - 2014-06-25 08:10:10.055; org.apache.solr.common.SolrException; 
> java.lang.ClassCastException: java.lang.Double cannot be cast to 
> org.apache.lucene.util.BytesRef
> at 
> org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:981)
> at org.apache.solr.schema.TextField.marshalSortValue(TextField.java:176)
> at 
> org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.serializeSearchGroup(SearchGroupsResultTransformer.java:125)
> at 
> org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:65)
> at 
> org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:43)
> at 
> org.apache.solr.search.grouping.CommandHandler.processResult(CommandHandler.java:193)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:340)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>   ...
> {noformat}
> It looks like {{serializeSearchGroup}} is matching the sort expression as the 
> {{"*"}} dynamic field, which is a TextField in the repro.






[jira] [Commented] (SOLR-6203) cast exception while searching with sort function and result grouping

2017-02-23 Thread Judith Silverman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881896#comment-15881896
 ] 

Judith Silverman commented on SOLR-6203:


Hi, Christine, are there changes you would like me to make in the patch dated 
05Dec16? As I recall, your commit of 23Dec16 took us part of the way toward 
that patch. 
Do my concerns about SOLR-9660 (02Dec16 above) make sense?
Thanks,
Judith

> cast exception while searching with sort function and result grouping
> -
>
> Key: SOLR-6203
> URL: https://issues.apache.org/jira/browse/SOLR-6203
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 4.7, 4.8
>Reporter: Nate Dire
>Assignee: Christine Poerschke
> Attachments: README, SOLR-6203.patch, SOLR-6203.patch, 
> SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, 
> SOLR-6203-unittest.patch, SOLR-6203-unittest.patch
>
>
> After upgrading from 4.5.1 to 4.7+, a schema including a {{"*"}} dynamic 
> field as text gets a cast exception when using a sort function and result 
> grouping.  
> Repro (with example config):
> # Add {{"*"}} dynamic field as a {{TextField}}, eg:
> {noformat}
> <dynamicField name="*" type="text_general"/>
> {noformat}
> # Create a sharded collection
> {noformat}
> curl 
> 'http://localhost:8983/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=2'
> {noformat}
> # Add example docs (query must have some results)
> # Submit query which sorts on a function result and uses result grouping:
> {noformat}
> {
>   "responseHeader": {
> "status": 500,
> "QTime": 50,
> "params": {
>   "sort": "sqrt(popularity) desc",
>   "indent": "true",
>   "q": "*:*",
>   "_": "1403709010008",
>   "group.field": "manu",
>   "group": "true",
>   "wt": "json"
> }
>   },
>   "error": {
> "msg": "java.lang.Double cannot be cast to 
> org.apache.lucene.util.BytesRef",
> "code": 500
>   }
> }
> {noformat}
> Source exception from log:
> {noformat}
> ERROR - 2014-06-25 08:10:10.055; org.apache.solr.common.SolrException; 
> java.lang.ClassCastException: java.lang.Double cannot be cast to 
> org.apache.lucene.util.BytesRef
> at 
> org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:981)
> at org.apache.solr.schema.TextField.marshalSortValue(TextField.java:176)
> at 
> org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.serializeSearchGroup(SearchGroupsResultTransformer.java:125)
> at 
> org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:65)
> at 
> org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:43)
> at 
> org.apache.solr.search.grouping.CommandHandler.processResult(CommandHandler.java:193)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:340)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>   ...
> {noformat}
> It looks like {{serializeSearchGroup}} is matching the sort expression as the 
> {{"*"}} dynamic field, which is a TextField in the repro.






[jira] [Created] (SOLR-10200) Streaming Expressions should use the shards parameter if present

2017-02-23 Thread Joel Bernstein (JIRA)
Joel Bernstein created SOLR-10200:
-

 Summary: Streaming Expressions should use the shards parameter if 
present
 Key: SOLR-10200
 URL: https://issues.apache.org/jira/browse/SOLR-10200
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein


Currently Streaming Expressions select shards using an internal ZooKeeper 
client. This ticket will allow stream sources to accept a *shards* parameter so 
that non-SolrCloud deployments can set the shards manually.

The shards parameters will be added as HTTP parameters in the following format:

collectionA.shards=url1,url2,...&collectionB.shards=url1,url2,...

The /stream handler will then add the shards to the StreamContext so all stream 
sources can check whether their collection has the shards set manually.
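A sketch of how a stream source might consult the context (the "shards" key and 
the map shape are assumptions, not the committed design):

{code}
import java.util.List;
import java.util.Map;
import org.apache.solr.client.solrj.io.stream.StreamContext;

public class ShardResolution {
  // Prefer manually supplied shard URLs over ZooKeeper discovery.
  @SuppressWarnings("unchecked")
  static List<String> resolveShards(StreamContext context, String collection) {
    Map<String, List<String>> shards =
        (Map<String, List<String>>) context.get("shards");
    if (shards != null && shards.containsKey(collection)) {
      return shards.get(collection);          // shards passed via HTTP params
    }
    return discoverFromZooKeeper(collection); // existing SolrCloud behavior
  }

  static List<String> discoverFromZooKeeper(String collection) {
    throw new UnsupportedOperationException("placeholder for ZK lookup");
  }
}
{code}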









[jira] [Comment Edited] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881883#comment-15881883
 ] 

Ishan Chattopadhyaya edited comment on SOLR-6736 at 2/24/17 3:42 AM:
-

I have created a branch jira/solr-6736 with the latest patch (after updating it 
for master).

Regarding the security vulnerability that this new API exposes, I have the 
following thoughts to take this forward:
# We can allow unauthenticated/unauthorized users to upload a configset, but 
mark such configsets with a "trusted=false" flag while storing in ZK (metadata 
on the configset's znode). If this endpoint is secured using authorization and 
authentication, then we can store the uploaded configsets with "trusted=true".
# Upon creation of a collection using an untrusted configset, any attempt to 
register a "vulnerable" component, e.g. StatelessScriptUpdateProcessor, 
XsltUpdateRequestHandler, DataImportHandler etc., should fail with an error 
that indicates that the configset was not trusted and it can be made trusted by 
enabling authentication/authorization for the API endpoint and re-uploading the 
configset. Same error when using a config API command to register any update 
handler using an untrusted configset.
# Ensure that untrusted configsets never overwrite existing trusted configsets.

As a separate exercise, we should audit our use of the XML parser to ensure XXE 
attacks are not possible on XML files, either uploaded from here/elsewhere or 
loaded from the disk.

[~varunrajput], [~anshumg], [~noble.paul], WDYT?


was (Author: ichattopadhyaya):
I have created a branch jira/solr-6736 with the latest patch (after updating it 
for master).

Regarding the security vulnerability that this new API exposes, I have the 
following thoughts to take this forward:
# We can allow unauthenticated/unauthorized users to upload a configset, but 
mark such configsets with a "trusted=false" flag while storing in ZK (metadata 
on the configset's znode). If this endpoint is secured using authorization and 
authentication, then we can store the uploaded configsets with "trusted=true".
# Upon creation of a collection using an untrusted configset, any attempt to 
register a "vulnerable" component, e.g. StatelessScriptUpdateProcessor, 
XsltUpdateRequestHandler, DataImportHandler etc., should fail with an error 
that indicates that the configset was not trusted and it can be made trusted by 
enabling authentication/authorization for the API endpoint and re-uploading the 
configset. Same error when using a config API command to register any update 
handler using an untrusted configset.

As a separate exercise, we should audit our use of the XML parser to ensure XXE 
attacks are not possible on XML files, either uploaded from here/elsewhere or 
loaded from the disk.

[~varunrajput], [~anshumg], [~noble.paul], WDYT?

> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf




[jira] [Commented] (SOLR-6736) A collections-like request handler to manage solr configurations on zookeeper

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881883#comment-15881883
 ] 

Ishan Chattopadhyaya commented on SOLR-6736:


I have created a branch jira/solr-6736 with the latest patch (after updating it 
for master).

Regarding the security vulnerability that this new API exposes, I have the 
following thoughts to take this forward:
# We can allow unauthenticated/unauthorized users to upload a configset, but 
mark such configsets with a "trusted=false" flag while storing in ZK (metadata 
on the configset's znode). If this endpoint is secured using authorization and 
authentication, then we can store the uploaded configsets with "trusted=true".
# Upon creation of a collection using an untrusted configset, any attempt to 
register a "vulnerable" component, e.g. StatelessScriptUpdateProcessor, 
XsltUpdateRequestHandler, DataImportHandler etc., should fail with an error 
that indicates that the configset was not trusted and it can be made trusted by 
enabling authentication/authorization for the API endpoint and re-uploading the 
configset. Same error when using a config API command to register any update 
handler using an untrusted configset.

As a separate exercise, we should audit our use of the XML parser to ensure XXE 
attacks are not possible on XML files, either uploaded from here/elsewhere or 
loaded from the disk.

[~varunrajput], [~anshumg], [~noble.paul], WDYT?
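For point #1, a hypothetical sketch of recording the flag as znode metadata 
with a plain ZooKeeper client (the path and JSON shape are illustrative 
assumptions, not the committed design):

{code}
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class TrustedConfigSetFlag {
  // Mark a configset's znode with trusted=true/false depending on whether
  // the upload request was authenticated.
  static void markConfigSet(ZooKeeper zk, String configName, boolean trusted)
      throws Exception {
    String path = "/configs/" + configName;
    byte[] metadata =
        ("{\"trusted\": " + trusted + "}").getBytes(StandardCharsets.UTF_8);
    if (zk.exists(path, false) == null) {
      zk.create(path, metadata, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    } else {
      zk.setData(path, metadata, -1); // -1: ignore znode version
    }
  }
}
{code}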

> A collections-like request handler to manage solr configurations on zookeeper
> -
>
> Key: SOLR-6736
> URL: https://issues.apache.org/jira/browse/SOLR-6736
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Varun Rajput
>Assignee: Ishan Chattopadhyaya
> Attachments: newzkconf.zip, SOLR-6736-newapi.patch, 
> SOLR-6736-newapi.patch, SOLR-6736-newapi.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, 
> SOLR-6736.patch, SOLR-6736.patch, SOLR-6736.patch, test_private.pem, 
> test_pub.der, zkconfighandler.zip, zkconfighandler.zip
>
>
> Managing Solr configuration files on zookeeper becomes cumbersome while using 
> solr in cloud mode, especially while trying out changes in the 
> configurations. 
> It will be great if there is a request handler that can provide an API to 
> manage the configurations similar to the collections handler that would allow 
> actions like uploading new configurations, linking them to a collection, 
> deleting configurations, etc.
> example : 
> {code}
> # Use the following command to upload a new configset called mynewconf. This 
> will fail if there is already a conf called 'mynewconf'. The file could be a 
> jar, zip, or tar file which contains all the files for this conf.
> curl -X POST -H 'Content-Type: application/octet-stream' --data-binary 
> @testconf.zip 
> http://localhost:8983/solr/admin/configs/mynewconf?sig=
> {code}
> A GET to http://localhost:8983/solr/admin/configs will give a list of configs 
> available
> A GET to http://localhost:8983/solr/admin/configs/mynewconf would give the 
> list of files in mynewconf






[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-10156:
--
Description: 
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermLength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}




  was:
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermlength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}





> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection,
>                  q="any query",
>                  field="some_text_field",
>                  minDocFreq="5",     // optional, default is 5 documents
>                  maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
>                  minTermLength="4",  // optional, default is 4
>                  limit="50")         // optional, default is 20
> {code}
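
The ticket does not spell out the exact scoring formula, but a plausible sketch of the foreground/background weighting it describes (counts in the result set versus counts in the corpus) could look like this; the class and method names are hypothetical:

{code}
// Plausible sketch only -- the ticket does not define the exact formula.
// Terms frequent in the result set but rare in the corpus score highest.
public final class Significance {
  private Significance() {}

  public static double score(long fgCount,       // docs containing the term in the result set
                             long bgCount,       // docs containing the term in the corpus
                             long resultSetSize,
                             long numDocs) {
    if (resultSetSize == 0 || numDocs == 0) return 0.0;
    double fgRate = (double) fgCount / resultSetSize;             // frequency in the results
    double idfLike = Math.log((double) numDocs / (bgCount + 1));  // rarity in the corpus
    return fgRate * idfLike;
  }
}
{code}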






[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_121) - Build # 19037 - Unstable!

2017-02-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/19037/
Java: 32bit/jdk1.8.0_121 -client -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.handler.extraction.TestExtractionDateUtil.testParseDate

Error Message:
Incorrect parsed timestamp: 1226583351000 != 1226579751000 (Thu Nov 13 04:35:51 
AKST 2008)

Stack Trace:
java.lang.AssertionError: Incorrect parsed timestamp: 1226583351000 != 
1226579751000 (Thu Nov 13 04:35:51 AKST 2008)
at 
__randomizedtesting.SeedInfo.seed([48772F467DACD6:4A510F1A3DD4DB63]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.handler.extraction.TestExtractionDateUtil.assertParsedDate(TestExtractionDateUtil.java:59)
at 
org.apache.solr.handler.extraction.TestExtractionDateUtil.testParseDate(TestExtractionDateUtil.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 18343 lines...]
   [junit4] Suite: org.apache.solr.handler.extraction.TestExtractionDateUtil
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestExtractionDateUtil -Dtests.method=testParseDate 
-Dtests.seed=48772F467DACD6 -Dtests.multiplier=3 -Dtests.slow=true 
-Dtests.locale=sr-ME -Dtests.timezone=America/Metlakatla -Dtests.asserts=true 

[jira] [Updated] (SOLR-9835) Create another replication mode for SolrCloud

2017-02-23 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-9835:
---
Attachment: (was: SOLR-9835.patch)

> Create another replication mode for SolrCloud
> -
>
> Key: SOLR-9835
> URL: https://issues.apache.org/jira/browse/SOLR-9835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is state-machine replication: 
> replicas start in the same initial state and each input is distributed across 
> the replicas, so all replicas end up in the same next state. But this type of 
> replication has some drawbacks:
> - The commit (which is costly) has to run on all replicas
> - Slow recovery: if a replica misses more than N updates during its down 
> time, it has to download the entire index from its leader.
> So we create another replication mode for SolrCloud called state transfer, 
> which acts like master/slave replication. Basically:
> - The leader distributes the updates to the other replicas, but only the 
> leader applies the updates to the IndexWriter; the other replicas just store 
> the updates in the UpdateLog (acting like replication slaves).
> - Replicas frequently poll the latest segments from the leader.
> Pros:
> - Lightweight indexing, because only the leader runs the commits and applies 
> the updates.
> - Very fast recovery: replicas just have to download the missing segments.
> From a CAP point of view, this ticket tries to promise end users a 
> distributed system with:
> - Partition tolerance
> - Weak consistency for normal queries: the cluster can serve stale data. This 
> happens when the leader has finished a commit and a slave is still fetching 
> the latest segments; this period is at most {{pollInterval + time to fetch 
> the latest segments}}.
> - Consistency for RTG: if we *do not use DBQs*, replicas are consistent with 
> the master, just like the original SolrCloud mode.
> - Weak availability: just like the original SolrCloud mode, if a leader goes 
> down, clients must wait until a new leader is elected.
> To use this new replication mode, a new collection must be created with the 
> additional parameter {{liveReplicas=1}}:
> {code}
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&liveReplicas=1
> {code}
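
A rough sketch of the replica-side behaviour described above (names and structure are hypothetical, not the attached patch): incoming updates are only logged, while a background task polls the leader for new segments.

{code}
// Hypothetical sketch of the "state transfer" replica described above.
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class StateTransferReplica {
  /** Stand-in for the segment-fetching call to the leader. */
  public interface Leader { void fetchLatestSegments(); }

  private final Queue<byte[]> updateLog = new ConcurrentLinkedQueue<>();
  private final ScheduledExecutorService poller =
      Executors.newSingleThreadScheduledExecutor();

  public void onUpdate(byte[] update) {
    // Do NOT apply to the local IndexWriter -- only the leader indexes.
    updateLog.add(update); // kept for RTG and recovery
  }

  public void start(Leader leader, long pollIntervalMs) {
    // Periodically pull whatever segments the leader has committed.
    poller.scheduleAtFixedRate(leader::fetchLatestSegments,
        pollIntervalMs, pollIntervalMs, TimeUnit.MILLISECONDS);
  }

  public void stop() { poller.shutdownNow(); }
}
{code}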






[jira] [Updated] (SOLR-9835) Create another replication mode for SolrCloud

2017-02-23 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-9835:
---
Attachment: SOLR-9835.patch

Updated patch based on the comments from [~shalinmangar] and [~ichattopadhyaya]:

bq. 2. ZkController.register method – The condition for !isLeader && 
onlyLeaderIndexes can be replaced by the isReplicaInOnlyLeaderIndexes variable.
Done!
bq. 3. Since there is no log replay on startup on replicas anymore, what if the 
replica is killed (which keeps its state as 'active' in ZK) and then the 
cluster is restarted and the replica becomes leader candidate? If we do not 
replay the discarded log then it could lead to data loss?
To solve this problem, we call {{copyOverOldUpdates}} from the most recent tlog 
on startup.
bq. 4. UpdateLog – Can you please add javadocs outlining the motivation/purpose 
of the new methods such as copyOverBufferingUpdates and switchToNewTlog e.g. 
why does switchToNewTlog require copying over some updates from the old tlog?
Done!
bq. 6. UpdateLog – why does copyOverBufferUpdates block updates while calling 
switchToNewTlog but ReplicateFromLeader doesn't? How are they both safe?
Both of them are blocking updates now.
bq. 8. ZkController.startReplicationFromLeader – Using a ConcurrentHashMap is 
not enough to prevent two simultaneous replications from happening 
concurrently. You should use the atomic putIfAbsent to put a core to the map 
before starting replication.
Done!

bq.Also, lets add a simple test to ensure that in-place updates work on a 
replica

I modified {{TestInPlaceUpdatesDistrib}} to run in a random mode. If the tests 
run in the new mode, we skip some outOfOrderDBQs tests.
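
For point 8, the putIfAbsent guard looks roughly like this (class and method bodies are hypothetical; only the putIfAbsent pattern itself is what the comment describes):

{code}
// Sketch of the putIfAbsent guard from point 8; names are hypothetical.
import java.util.concurrent.ConcurrentHashMap;

public class ReplicationGuard {
  private final ConcurrentHashMap<String, Boolean> replicating = new ConcurrentHashMap<>();

  public void startReplicationFromLeader(String core) {
    if (replicating.putIfAbsent(core, Boolean.TRUE) != null) {
      return; // another thread already started replication for this core
    }
    // ... safe to start replication for `core` here ...
  }

  public void stopReplicationFromLeader(String core) {
    replicating.remove(core); // allow a later restart
  }
}
{code}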

> Create another replication mode for SolrCloud
> -
>
> Key: SOLR-9835
> URL: https://issues.apache.org/jira/browse/SOLR-9835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is state-machine replication: 
> replicas start in the same initial state and each input is distributed across 
> the replicas, so all replicas end up in the same next state. But this type of 
> replication has some drawbacks:
> - The commit (which is costly) has to run on all replicas
> - Slow recovery: if a replica misses more than N updates during its down 
> time, it has to download the entire index from its leader.
> So we create another replication mode for SolrCloud called state transfer, 
> which acts like master/slave replication. Basically:
> - The leader distributes the updates to the other replicas, but only the 
> leader applies the updates to the IndexWriter; the other replicas just store 
> the updates in the UpdateLog (acting like replication slaves).
> - Replicas frequently poll the latest segments from the leader.
> Pros:
> - Lightweight indexing, because only the leader runs the commits and applies 
> the updates.
> - Very fast recovery: replicas just have to download the missing segments.
> From a CAP point of view, this ticket tries to promise end users a 
> distributed system with:
> - Partition tolerance
> - Weak consistency for normal queries: the cluster can serve stale data. This 
> happens when the leader has finished a commit and a slave is still fetching 
> the latest segments; this period is at most {{pollInterval + time to fetch 
> the latest segments}}.
> - Consistency for RTG: if we *do not use DBQs*, replicas are consistent with 
> the master, just like the original SolrCloud mode.
> - Weak availability: just like the original SolrCloud mode, if a leader goes 
> down, clients must wait until a new leader is elected.
> To use this new replication mode, a new collection must be created with the 
> additional parameter {{liveReplicas=1}}:
> {code}
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&liveReplicas=1
> {code}






[jira] [Updated] (SOLR-9835) Create another replication mode for SolrCloud

2017-02-23 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-9835:
---
Attachment: SOLR-9835.patch

> Create another replication mode for SolrCloud
> -
>
> Key: SOLR-9835
> URL: https://issues.apache.org/jira/browse/SOLR-9835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is state-machine replication: 
> replicas start in the same initial state and each input is distributed across 
> the replicas, so all replicas end up in the same next state. But this type of 
> replication has some drawbacks:
> - The commit (which is costly) has to run on all replicas
> - Slow recovery: if a replica misses more than N updates during its down 
> time, it has to download the entire index from its leader.
> So we create another replication mode for SolrCloud called state transfer, 
> which acts like master/slave replication. Basically:
> - The leader distributes the updates to the other replicas, but only the 
> leader applies the updates to the IndexWriter; the other replicas just store 
> the updates in the UpdateLog (acting like replication slaves).
> - Replicas frequently poll the latest segments from the leader.
> Pros:
> - Lightweight indexing, because only the leader runs the commits and applies 
> the updates.
> - Very fast recovery: replicas just have to download the missing segments.
> From a CAP point of view, this ticket tries to promise end users a 
> distributed system with:
> - Partition tolerance
> - Weak consistency for normal queries: the cluster can serve stale data. This 
> happens when the leader has finished a commit and a slave is still fetching 
> the latest segments; this period is at most {{pollInterval + time to fetch 
> the latest segments}}.
> - Consistency for RTG: if we *do not use DBQs*, replicas are consistent with 
> the master, just like the original SolrCloud mode.
> - Weak availability: just like the original SolrCloud mode, if a leader goes 
> down, clients must wait until a new leader is elected.
> To use this new replication mode, a new collection must be created with the 
> additional parameter {{liveReplicas=1}}:
> {code}
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&liveReplicas=1
> {code}






[jira] [Commented] (SOLR-8182) TestSolrCloudWithKerberosAlt fails consistently on JDK9

2017-02-23 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881732#comment-15881732
 ] 

Hoss Man commented on SOLR-8182:


It's not clear to me whether the initially reported test failures (pre-jigsaw) 
were a JVM bug that's been fixed in more recent Java9 EA builds, or whether the 
underlying problem still exists (either in the JVM or in Solr) but we never get 
that far because of jigsaw related failures.

We almost certainly won't know the answer until SOLR-8052 and SOLR-10199 are 
resolved, so marking this bug as blocked by both of those.

> TestSolrCloudWithKerberosAlt fails consistently on JDK9
> ---
>
> Key: SOLR-8182
> URL: https://issues.apache.org/jira/browse/SOLR-8182
> Project: Solr
>  Issue Type: Test
>  Components: security, SolrCloud
>Reporter: Shalin Shekhar Mangar
>Priority: Minor
>  Labels: Java9
> Fix For: 5.5, 6.0
>
>
> The test fails consistently on JDK9 with the following initialization error:
> {code}
> FAILED:  org.apache.solr.cloud.TestSolrCloudWithKerberosAlt.testBasics
> Error Message:
> org.apache.directory.api.ldap.model.exception.LdapOtherException: 
> ERR_04447_CANNOT_NORMALIZE_VALUE Cannot normalize the wrapped value 
> ERR_04473_NOT_VALID_VALUE Not a valid value '20090818022733Z' for the 
> AttributeType 'ATTRIBUTE_TYPE ( 1.3.6.1.4.1.18060.0.4.1.2.35  NAME 
> 'schemaModifyTimestamp'  DESC time which schema was modified  SUP 
> modifyTimestamp  EQUALITY generalizedTimeMatch  ORDERING 
> generalizedTimeOrderingMatch  SYNTAX 1.3.6.1.4.1.1466.115.121.1.24  USAGE 
> directoryOperation  ) '
> Stack Trace:
> org.apache.directory.api.ldap.model.exception.LdapOtherException: 
> org.apache.directory.api.ldap.model.exception.LdapOtherException: 
> ERR_04447_CANNOT_NORMALIZE_VALUE Cannot normalize the wrapped value 
> ERR_04473_NOT_VALID_VALUE Not a valid value '20090818022733Z' for the 
> AttributeType 'ATTRIBUTE_TYPE ( 1.3.6.1.4.1.18060.0.4.1.2.35
>  NAME 'schemaModifyTimestamp'
>  DESC time which schema was modified
>  SUP modifyTimestamp
>  EQUALITY generalizedTimeMatch
>  ORDERING generalizedTimeOrderingMatch
>  SYNTAX 1.3.6.1.4.1.1466.115.121.1.24
>  USAGE directoryOperation
>  )
> '
> at 
> __randomizedtesting.SeedInfo.seed([321A63D948BF59B7:FC2CDF5705107C7]:0)
> at 
> org.apache.directory.server.core.api.partition.AbstractPartition.initialize(AbstractPartition.java:84)
> at 
> org.apache.directory.server.core.DefaultDirectoryService.initialize(DefaultDirectoryService.java:1808)
> at 
> org.apache.directory.server.core.DefaultDirectoryService.startup(DefaultDirectoryService.java:1248)
> at 
> org.apache.hadoop.minikdc.MiniKdc.initDirectoryService(MiniKdc.java:383)
> at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:319)
> at 
> org.apache.solr.cloud.TestSolrCloudWithKerberosAlt.setupMiniKdc(TestSolrCloudWithKerberosAlt.java:105)
> at 
> org.apache.solr.cloud.TestSolrCloudWithKerberosAlt.setUp(TestSolrCloudWithKerberosAlt.java:94)
> {code}






[jira] [Commented] (SOLR-8052) Tests using MiniKDC do not work with Java 9 Jigsaw

2017-02-23 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881720#comment-15881720
 ] 

Hoss Man commented on SOLR-8052:


Created SOLR-10199 to track the *non-test* code problems with using Solr's 
kerberos features in Java9 (discovered with this patch)

> Tests using MiniKDC do not work with Java 9 Jigsaw
> --
>
> Key: SOLR-8052
> URL: https://issues.apache.org/jira/browse/SOLR-8052
> Project: Solr
>  Issue Type: Bug
>  Components: Authentication
>Affects Versions: 5.3
>Reporter: Uwe Schindler
>  Labels: Java9
> Attachments: SOLR-8052.patch
>
>
> As described in my status update yesterday, there are some problems in 
> dependencies shipped with Solr that don't work with Java 9 Jigsaw builds.
> org.apache.solr.cloud.SaslZkACLProviderTest.testSaslZkACLProvider
> {noformat}
>[junit4]> Throwable #1: java.lang.RuntimeException: 
> java.lang.IllegalAccessException: Class org.apache.hadoop.minikdc.MiniKdc can 
> not access a member of class sun.security.krb5.Config (module 
> java.security.jgss) with modifiers "public static", module java.security.jgss 
> does not export sun.security.krb5 to 
>[junit4]>at 
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:211)
>[junit4]>at 
> org.apache.solr.cloud.SaslZkACLProviderTest.setUp(SaslZkACLProviderTest.java:81)
>[junit4]>at java.lang.Thread.run(java.base@9.0/Thread.java:746)
>[junit4]> Caused by: java.lang.IllegalAccessException: Class 
> org.apache.hadoop.minikdc.MiniKdc can not access a member of class 
> sun.security.krb5.Config (module java.security.jgss) with modifiers "public 
> static", module java.security.jgss does not export sun.security.krb5 to 
> 
>[junit4]>at 
> java.lang.reflect.AccessibleObject.slowCheckMemberAccess(java.base@9.0/AccessibleObject.java:384)
>[junit4]>at 
> java.lang.reflect.AccessibleObject.checkAccess(java.base@9.0/AccessibleObject.java:376)
>[junit4]>at 
> org.apache.hadoop.minikdc.MiniKdc.initKDCServer(MiniKdc.java:478)
>[junit4]>at 
> org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:320)
>[junit4]>at 
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:204)
>[junit4]>... 38 moreThrowable #2: 
> java.lang.NullPointerException
>[junit4]>at 
> org.apache.solr.cloud.ZkTestServer$ZKServerMain.shutdown(ZkTestServer.java:334)
>[junit4]>at 
> org.apache.solr.cloud.ZkTestServer.shutdown(ZkTestServer.java:526)
>[junit4]>at 
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.shutdown(SaslZkACLProviderTest.java:218)
>[junit4]>at 
> org.apache.solr.cloud.SaslZkACLProviderTest.tearDown(SaslZkACLProviderTest.java:116)
>[junit4]>at java.lang.Thread.run(java.base@9.0/Thread.java:746)
> {noformat}
> This is really bad, bad, bad! All security related stuff should never ever be 
> reflected on!
> So we have to open an issue in the MiniKdc project so they remove the "hacks". 
> Elasticsearch had
> similar problems with Amazon's AWS API. They worked around it with a funny hack 
> in their SecurityPolicy
> (https://github.com/elastic/elasticsearch/pull/13538). But as Solr does not 
> run with SecurityManager
> in production, there is no way to do that. 
> We should report an issue on the MiniKdc project, so they fix their code and 
> remove the really bad reflection on Java's internal classes.
> FYI, my 
> [conclusion|http://mail-archives.apache.org/mod_mbox/lucene-dev/201509.mbox/%3C014801d0ee23%245c8f5df0%2415ae19d0%24%40thetaphi.de%3E]
>  from yesterday.






[jira] [Created] (SOLR-10199) Solr's Kerberos functionality does not work in Java9 due to dependency on hadoop's AuthenticationFilter which attempts access to JVM protected classes

2017-02-23 Thread Hoss Man (JIRA)
Hoss Man created SOLR-10199:
---

 Summary: Solr's Kerberos functionality does not work in Java9 due 
to dependency on hadoop's AuthenticationFilter, which attempts access to JVM 
protected classes
 Key: SOLR-10199
 URL: https://issues.apache.org/jira/browse/SOLR-10199
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hoss Man


(discovered this while working on test improvements for SOLR-8052)

Our Kerberos based authn/authz features are all built on top of Hadoop's 
{{AuthenticationFilter}} which in turn uses Hadoop's {{KerberosUtil}} -- but 
this does not work on Java9/jigsaw JVMs because that class in turn attempts to 
access {{sun.security.jgss.GSSUtil}} which is not exported by {{module 
java.security.jgss}}

This means that Solr users who depend on Kerberos will not be able to upgrade 
to Java9, even if they do not use any Hadoop specific features of Solr.



Example log messages...

{noformat}
   [junit4]   2> 6833 WARN  (qtp442059499-30) [] 
o.a.h.s.a.s.AuthenticationFilter Authentication exception: 
java.lang.IllegalAccessException: class 
org.apache.hadoop.security.authentication.util.KerberosUtil cannot access class 
sun.security.jgss.GSSUtil (in module java.security.jgss) because module 
java.security.jgss does not export sun.security.jgss to unnamed module @4b38fe8b
   [junit4]   2> 6841 WARN  
(TEST-TestSolrCloudWithKerberosAlt.testBasics-seed#[95A583AF82D1EBBE]) [] 
o.a.h.c.p.ResponseProcessCookies Invalid cookie header: "Set-Cookie: 
hadoop.auth=; Path=/; Domain=127.0.0.1; Expires=Ara, 01-Sa-1970 00:00:00 GMT; 
HttpOnly". Invalid 'expires' attribute: Ara, 01-Sa-1970 00:00:00 GMT
{noformat}

(NOTE: HADOOP-14115 is cause of malformed cookie expiration)

ultimately the client gets a 403 error (as seen in a testcase with patch from 
SOLR-8052 applied and java9 assume commented out)...

{noformat}
   [junit4] ERROR   7.10s | TestSolrCloudWithKerberosAlt.testBasics <<<
   [junit4]> Throwable #1: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://127.0.0.1:34687/solr: Expected mime type 
application/octet-stream but got text/html. 
   [junit4]> 
   [junit4]> 
   [junit4]> Error 403 
   [junit4]> 
   [junit4]> 
   [junit4]> HTTP ERROR: 403
   [junit4]> Problem accessing /solr/admin/collections. Reason:
   [junit4]> java.lang.IllegalAccessException: class 
org.apache.hadoop.security.authentication.util.KerberosUtil cannot access class 
sun.security.jgss.GSSUtil (in module java.security.jgss) because module 
java.security.jgss does not export sun.security.jgss to unnamed module 
@4b38fe8b
   [junit4]> <a href="http://eclipse.org/jetty">Powered by Jetty:// 9.3.14.v20161028</a>
   [junit4]> 
   [junit4]> 
{noformat}
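
A minimal reproduction of the underlying module-system behaviour (a sketch, not code from Hadoop or Solr) looks like this:

{code}
// Minimal sketch: reflective access into a non-exported JDK package fails
// on Java 9+ with InaccessibleObjectException / IllegalAccessException.
import java.lang.reflect.Method;

public class JigsawAccessDemo {
  public static void main(String[] args) {
    try {
      Class<?> gssUtil = Class.forName("sun.security.jgss.GSSUtil");
      Method m = gssUtil.getDeclaredMethods()[0]; // reading metadata is allowed
      m.setAccessible(true); // blocked: java.security.jgss does not open sun.security.jgss
      System.out.println("accessible (pre-Java 9 behaviour)");
    } catch (ClassNotFoundException | RuntimeException e) {
      System.err.println("blocked by the module system: " + e);
    }
  }
}
{code}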







[jira] [Updated] (SOLR-8052) Tests using MiniKDC do not work with Java 9 Jigsaw

2017-02-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-8052:
---
Attachment: SOLR-8052.patch


I've been digging into some of the Java9 related SOLR jiras -- starting with 
the kerberos based test problems -- to try and figure out if these really are 
test only bugs and/or if there is anything we can do about making things work 
better.

Based on my initial reading/experimenting, I think we should replace MiniKdc 
(from hadoop's test infrastructure) with SimpleKdcServer (from the apache kerby 
project)...

* SimpleKdcServer does not appear to have reflection related bugs that cause 
problems under jigsaw like MiniKdc does
* SimpleKdcServer does not suffer from the same "can't use multiple nodes" 
problem (HADOOP-9893) that has required {{TestMiniSolrCloudClusterKerberos}} to 
be {{@Ignored}} since it was created.
** I was able to add multiple solr nodes to {{TestSolrCloudWithKerberosAlt}} 
w/o problems after switching
** With a few other modifications, I was able to get 
{{TestMiniSolrCloudClusterKerberos}} to work as well (details below)
* In hadoop's master branch, MiniKdc has been refactored to use SimpleKdcServer 
internally anyway

Doing this isn't a silver bullet for the java9/jigsaw related failures (I'll 
file a new bug about that), but it should help us move forward -- and in 
general seems like an improvement.
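
For reference, standing up Kerby's SimpleKdcServer looks roughly like this; a sketch of the Kerby API as I understand it, with placeholder realm, paths and principal, not the attached patch:

{code}
// Rough sketch of Apache Kerby's SimpleKdcServer for tests; the realm,
// work directory and principal below are placeholders.
import java.io.File;
import org.apache.kerby.kerberos.kerb.server.SimpleKdcServer;

public class KdcSketch {
  public static void main(String[] args) throws Exception {
    SimpleKdcServer kdc = new SimpleKdcServer();
    kdc.setKdcRealm("EXAMPLE.COM");
    kdc.setKdcHost("localhost");
    kdc.setWorkDir(new File("target/kdc")); // keytabs and config land here
    kdc.init();
    kdc.start();
    kdc.createPrincipal("solr/localhost@EXAMPLE.COM", "secret");
    // ... run Kerberos-enabled tests against the KDC ...
    kdc.stop();
  }
}
{code}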



The attached patch is a starting point for this change.

One thing I'm not particularly happy with here is that in order to get it to 
pass, I _had_ to modify {{TestMiniSolrCloudClusterKerberos}} to create a single 
{{KerberosTestServices}} instance in the {{@BeforeClass}} method, instead of in 
a regular {{@Before}} method.

*In and of itself, this change isn't necessarily bad -- it just means we only 
start one Kerberos server instead of one per method.*

What concerns me is that w/o this change, only the first test method would ever 
pass, and subsequent test methods would log/throw errors from ZK -- and running 
any single test method with {{-Dtests.method}} would (seemingly) always pass.

My initial suspicion was that something in {{SimpleKdcServer}}, or in our 
{{KerberosTestServices}} wrapper, wasn't "resetting" the JVM security settings 
correctly when we shut it down -- but if that were the case I would expect 
something like {{ant test -Dtests.jvms=1 -Dtests.class=\*Kerber\*}} to fail 
(even with {{KerberosTestServices}} only ever being instantiated once per test 
class) when the (sole) Test JVM got to the second test class and instantiated a 
second {{KerberosTestServices}} instance.

However that doesn't seem to be the case.  For some reason, using only one 
{{KerberosTestServices}} in a test class is fine, regardless of how many test 
classes using kerberos run in that JVM, but using multiple 
{{KerberosTestServices}} in a single test class causes kerberos failures.

For the purposes of demonstrating this (in contrast with the changes made in 
{{TestMiniSolrCloudClusterKerberos}} which seem like a good idea either way) 
I've added a {{TestHossSanity}} which works just like 
{{TestMiniSolrCloudClusterKerberos}}, except that it initializes 
{{KerberosTestServices}} on a per test-method basis.

Examples of the types of Kerberos errors it logs (after the first test method 
succeeds)...

{noformat}
...
   [junit4] <JUnit4> says 你好! Master seed: 6BEDD90DB0D4DC38
   [junit4] Executing 1 suite with 1 JVM.
   [junit4] 
   [junit4] Started J0 PID(11220@tray).
   [junit4] Suite: org.apache.solr.cloud.TestHossSanity
   [junit4]   2> 0INFO  
(TEST-TestHossSanity.testStopAllStartAll-seed#[6BEDD90DB0D4DC38]) [] 
o.a.s.c.MiniSolrCloudCluster Starting cluster of 5 servers in 
/home/hossman/lucene/dev/solr/build/solr-core/test/J0/temp/solr.cloud.TestHossSanity_6BEDD90DB0D4DC38-001/tempDir-002
   [junit4]   2> 11   INFO  
(TEST-TestHossSanity.testStopAllStartAll-seed#[6BEDD90DB0D4DC38]) [] 
o.a.s.c.ZkTestServer STARTING ZK TEST SERVER


...first test (testStopAllStartAll) proceeds and runs fine...


   [junit4] OK  31.2s | TestHossSanity.testStopAllStartAll
   [junit4]   2> 30989 INFO  
(TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38])
 [] o.a.s.c.MiniSolrCloudCluster Starting cluster of 5 servers in 
/home/hossman/lucene/dev/solr/build/solr-core/test/J0/temp/solr.cloud.TestHossSanity_6BEDD90DB0D4DC38-001/tempDir-004
   [junit4]   2> 30989 INFO  
(TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38])
 [] o.a.s.c.ZkTestServer STARTING ZK TEST SERVER
...
   [junit4]   2> 30989 INFO  (Thread-100) [] o.a.s.c.ZkTestServer client 
port:0.0.0.0/0.0.0.0:0
   [junit4]   2> 30990 INFO  (Thread-100) [] o.a.s.c.ZkTestServer Starting 
server
   [junit4]   2> 30995 INFO  (pool-11-thread-1) [] o.a.k.k.k.s.r.KdcRequest 
Client entry is empty.
   [junit4]   2> 30995 INFO  (pool-11-thread-1) [

[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-10156:
--
Description: 
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermlength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}




  was:
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",
                 maxDocFreq=".3",
                 minTermlength="4",
                 limit="50")
{code}





> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection,
>                  q="any query",
>                  field="some_text_field",
>                  minDocFreq="5",     // optional, default is 5 documents
>                  maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
>                  minTermlength="4",  // optional, default is 4
>                  limit="50")         // optional, default is 20
> {code}






[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-10156:
--
Description: 
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermlength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}




  was:
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermlength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}





> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection,
>                  q="any query",
>                  field="some_text_field",
>                  minDocFreq="5",     // optional, default is 5 documents
>                  maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
>                  minTermlength="4",  // optional, default is 4
>                  limit="50")         // optional, default is 20
> {code}






[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-10156:
--
Description: 
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermlength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}




  was:
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermlength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}





> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection,
>                  q="any query",
>                  field="some_text_field",
>                  minDocFreq="5",     // optional, default is 5 documents
>                  maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
>                  minTermlength="4",  // optional, default is 4
>                  limit="50")         // optional, default is 20
> {code}






[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-10156:
--
Description: 
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermlength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}




  was:
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",     // optional, default is 5 documents
                 maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
                 minTermlength="4",  // optional, default is 4
                 limit="50")         // optional, default is 20
{code}





> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection,
>                  q="any query",
>                  field="some_text_field",
>                  minDocFreq="5",     // optional, default is 5 documents
>                  maxDocFreq=".3",    // optional, default is no more than 30% of the index (.3)
>                  minTermlength="4",  // optional, default is 4
>                  limit="50")         // optional, default is 20
> {code}






[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-10156:
--
Description: 
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="any query",
                 field="some_text_field",
                 minDocFreq="5",
                 maxDocFreq=".3",
                 minTermlength="4",
                 limit="50")
{code}




  was:
The significantTerms Streaming Expression will emit a set of terms from a *text 
field* within a doc frequency range for a specific query. It will also score 
the terms based on how many times the terms appear in the result set, and how 
many times the terms appear in the corpus, and return the top N terms based on 
this significance score.

Syntax:

{code}
significantTerms(collection,
                 q="abc",
                 field="some_text_field",
                 minDocFreq="x",
                 maxDocFreq="y",
                 limit="50")
{code}


> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection,
>                  q="any query",
>                  field="some_text_field",
>                  minDocFreq="5",
>                  maxDocFreq=".3",
>                  minTermlength="4",
>                  limit="50")
> {code}






[jira] [Commented] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length

2017-02-23 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881610#comment-15881610
 ] 

Erick Erickson commented on LUCENE-7705:


These two tests fail:
org.apache.lucene.analysis.core.TestUnicodeWhitespaceTokenizer.testParamsFactory
org.apache.lucene.analysis.core.TestRandomChains (suite)


TestUnicodeWhitespaceTokenizer fails because I added "to" to one of the exception 
messages; a trivial fix.

No idea what's happening with TestRandomChains though.

And my nocommit ensures that 'ant precommit' will fail too; that's rather the 
point.

> Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the 
> max token length
> -
>
> Key: LUCENE-7705
> URL: https://issues.apache.org/jira/browse/LUCENE-7705
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Amrit Sarkar
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: LUCENE-7705.patch
>
>
> SOLR-10186
> [~erickerickson]: Is there a good reason that we hard-code a 256 character 
> limit for the CharTokenizer? In order to change this limit it requires that 
> people copy/paste the incrementToken into some new class since incrementToken 
> is final.
> KeywordTokenizer can easily change the default (which is also 256 bytes), but 
> to do so requires code rather than being able to configure it in the schema.
> For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes 
> (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) 
> (Factories) it would take adding a c'tor to the base class in Lucene and 
> using it in the factory.
> Any objections?






[jira] [Assigned] (SOLR-10186) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length

2017-02-23 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-10186:
-

Assignee: Erick Erickson

> Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the 
> max token length
> -
>
> Key: SOLR-10186
> URL: https://issues.apache.org/jira/browse/SOLR-10186
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: SOLR-10186.patch, SOLR-10186.patch, SOLR-10186.patch
>
>
> Is there a good reason that we hard-code a 256 character limit for the 
> CharTokenizer? In order to change this limit it requires that people 
> copy/paste the incrementToken into some new class since incrementToken is 
> final.
> KeywordTokenizer can easily change the default (which is also 256 bytes), but 
> to do so requires code rather than being able to configure it in the schema.
> For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes 
> (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) 
> (Factories) it would take adding a c'tor to the base class in Lucene and 
> using it in the factory.
> Any objections?
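
For context, this is what "requires code rather than being able to configure it in the schema" means today for KeywordTokenizer; the buffer size is only reachable through a constructor (snippet assumes lucene-analyzers-common on the classpath):

{code}
// Today the only way around KeywordTokenizer's 256 default is code:
import org.apache.lucene.analysis.core.KeywordTokenizer;

public class MaxTokenLenDemo {
  public static void main(String[] args) throws Exception {
    KeywordTokenizer tok = new KeywordTokenizer(4096); // larger buffer via ctor
    tok.close();
  }
}
{code}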






[jira] [Assigned] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length

2017-02-23 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned LUCENE-7705:
--

Assignee: Erick Erickson

> Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the 
> max token length
> -
>
> Key: LUCENE-7705
> URL: https://issues.apache.org/jira/browse/LUCENE-7705
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Amrit Sarkar
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: LUCENE-7705.patch
>
>
> SOLR-10186
> [~erickerickson]: Is there a good reason that we hard-code a 256 character 
> limit for the CharTokenizer? In order to change this limit it requires that 
> people copy/paste the incrementToken into some new class since incrementToken 
> is final.
> KeywordTokenizer can easily change the default (which is also 256 bytes), but 
> to do so requires code rather than being able to configure it in the schema.
> For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes 
> (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) 
> (Factories) it would take adding a c'tor to the base class in Lucene and 
> using it in the factory.
> Any objections?






[jira] [Commented] (SOLR-10194) Unable to use the UninvertedField implementation with legacy facets

2017-02-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881542#comment-15881542
 ] 

Shawn Heisey commented on SOLR-10194:
-

This could be part of an issue where the description is just "Solr 6.x 
performance is much worse than Solr 4.x performance."  This statement is 
particularly true when facets (and probably grouping) are involved.  For the 
person who filed this issue (who I have been talking to via IRC), enabling 
docValues and reindexing makes performance worse, not better.

> Unable to use the UninvertedField implementation with legacy facets
> ---
>
> Key: SOLR-10194
> URL: https://issues.apache.org/jira/browse/SOLR-10194
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.2, 6.3, 6.4.1
> Environment: Linux
>Reporter: Victor Igumnov
>Priority: Minor
>  Labels: easyfix
>
> FacetComponent's method "modifyRequestForFieldFacets" modifies the 
> distributed facet request and sets the mincount to zero, so the 
> SimpleFacets implementation is unable to reach the UIF code block when 
> facet.method=uif is applied. The workaround I found is to use 
> facet.distrib.mco=true, which sets the mincount to one instead of zero. 
> Working:
> http://somehost:9100/solr/collection/select?facet.method=uif=attribute=*:*=true=true=true
>  
> Non-working:
> http://somehost:9100/solr/collection/select?facet.method=uif=attribute=*:*=true=true=false
> Semi-working when it isn't a distributed call:
> http://somehost:9100/solr/collection/select?facet.method=uif=attribute=*:*=true=true=false=false
> Just make sure to run it on a multi-shard setup. 






[jira] [Commented] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length

2017-02-23 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881459#comment-15881459
 ] 

Erick Erickson commented on LUCENE-7705:


I think the patch I uploaded is the result of applying your most recent patch for 
SOLR-10186, but can you verify? We should probably consolidate the two; I suggest 
we close the Solr one as a duplicate and continue iterating here.

Erick

> Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the 
> max token length
> -
>
> Key: LUCENE-7705
> URL: https://issues.apache.org/jira/browse/LUCENE-7705
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Amrit Sarkar
>Priority: Minor
> Attachments: LUCENE-7705.patch
>
>
> SOLR-10186
> [~erickerickson]: Is there a good reason that we hard-code a 256 character 
> limit for the CharTokenizer? In order to change this limit it requires that 
> people copy/paste the incrementToken into some new class since incrementToken 
> is final.
> KeywordTokenizer can easily change the default (which is also 256 bytes), but 
> to do so requires code rather than being able to configure it in the schema.
> For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes 
> (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) 
> (Factories) it would take adding a c'tor to the base class in Lucene and 
> using it in the factory.
> Any objections?






[JENKINS] Lucene-Solr-6.x-Windows (64bit/jdk1.8.0_121) - Build # 747 - Unstable!

2017-02-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Windows/747/
Java: 64bit/jdk1.8.0_121 -XX:-UseCompressedOops -XX:+UseG1GC

2 tests failed.
FAILED:  org.apache.solr.cloud.OverseerRolesTest.testOverseerRole

Error Message:
Timed out waiting for overseer state change

Stack Trace:
java.lang.AssertionError: Timed out waiting for overseer state change
at 
__randomizedtesting.SeedInfo.seed([742460D0CAF956E6:95EF9D44F14A6037]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.OverseerRolesTest.waitForNewOverseer(OverseerRolesTest.java:62)
at 
org.apache.solr.cloud.OverseerRolesTest.testOverseerRole(OverseerRolesTest.java:140)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:745)


FAILED:  org.apache.solr.cloud.ShardSplitTest.testSplitWithChaosMonkey

Error Message:
There are still nodes recoverying - waited for 330 seconds

Stack Trace:
java.lang.AssertionError: There are still nodes recoverying - waited for 330 

[jira] [Updated] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length

2017-02-23 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-7705:
---
Attachment: LUCENE-7705.patch

Patch that fixes up a few comments and regularizes maxChars* to maxToken* and the 
like. I enhanced a test to exercise tokens longer than 256 characters.

There was a problem with LowerCaseTokenizerFactory: the getMultiTermComponent 
method constructed a LowerCaseFilterFactory with the _original_ arguments, 
including maxTokenLen, which then threw an error. There's a nocommit in there 
for the nonce; what's the right thing to do here?

[~amrit sarkar] Do you have any ideas for a more elegant solution? The nocommit 
is there because this feels just too hacky, but it does prove that this is 
the problem.

It seems like we should close SOLR-10186 and just make the code changes here. 
With this patch I successfully tested adding fields with tokens both longer and 
shorter than 256 characters, so I don't think there's anything beyond this patch 
to do with Solr. I suppose we could add some maxTokenLen bits to some of the 
schemas just to exercise that (which would have found the 
LowerCaseTokenizerFactory bit).
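
For readers following the thread, a minimal sketch of what the proposed 
constructors would allow, assuming the shape discussed here (an AttributeFactory 
plus an int maxTokenLen); the 512 value is illustrative and this is not the 
committed API:

{code}
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.util.AttributeFactory;

public class MaxTokenLenSketch {
  public static Tokenizer newWhitespaceTokenizer() {
    // Proposed: pass the max token length explicitly instead of relying on
    // the hard-coded 256-character limit in CharTokenizer.
    return new WhitespaceTokenizer(AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY, 512);
  }
}
{code}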

> Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the 
> max token length
> -
>
> Key: LUCENE-7705
> URL: https://issues.apache.org/jira/browse/LUCENE-7705
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Amrit Sarkar
>Priority: Minor
> Attachments: LUCENE-7705.patch
>
>
> SOLR-10186
> [~erickerickson]: Is there a good reason that we hard-code a 256 character 
> limit for the CharTokenizer? Changing this limit requires people to copy/paste 
> incrementToken into some new class, since incrementToken 
> is final.
> KeywordTokenizer can easily change the default (which is also 256 bytes), but 
> to do so requires code rather than being able to configure it in the schema.
> For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes 
> (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) 
> (Factories) it would take adding a c'tor to the base class in Lucene and 
> using it in the factory.
> Any objections?






[JENKINS] Lucene-Solr-Tests-master - Build # 1691 - Still Unstable

2017-02-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/1691/

1 tests failed.
FAILED:  
org.apache.solr.cloud.CdcrBootstrapTest.testConvertClusterToCdcrAndBootstrap

Error Message:
Document mismatch on target after sync expected:<1000> but was:<0>

Stack Trace:
java.lang.AssertionError: Document mismatch on target after sync 
expected:<1000> but was:<0>
at 
__randomizedtesting.SeedInfo.seed([1375753A57F29F1F:C4A25A4DE3AD0758]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.cloud.CdcrBootstrapTest.testConvertClusterToCdcrAndBootstrap(CdcrBootstrapTest.java:134)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 12445 lines...]
   [junit4] Suite: org.apache.solr.cloud.CdcrBootstrapTest
   [junit4]   2> Creating 

[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881362#comment-15881362
 ] 

ASF subversion and git services commented on SOLR-9764:
---

Commit 92e619260cc89b4725c2e5e971fc3cb7bbb339cc in lucene-solr's branch 
refs/heads/branch_6x from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=92e6192 ]

SOLR-9764: fix CHANGES entry


> Design a memory efficient DocSet if a query returns all docs
> 
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
>Assignee: Yonik Seeley
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR_9764_no_cloneMe.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch
>
>
> In some use cases, particularly with time series data, using a collection 
> alias and partitioning data into multiple small collections by 
> timestamp, a filter query can match all documents in a collection. Currently 
> BitDocSet is used, which contains a large array of long integers with every 
> bit set to 1. After querying, the resulting DocSet saved in the filter cache is 
> large and becomes one of the main memory consumers in these use cases.
> For example, suppose a Solr setup has 14 collections for data from the last 14 
> days, each collection with one day of data. A filter query for the last week of 
> data would result in at least six DocSets in the filter cache, each matching all 
> documents in one of six collections.
> This is to design a new DocSet that is memory efficient for such a use case. 
> The new DocSet removes the large array, reducing memory usage and GC pressure 
> without losing the advantage of a large filter cache.
> In particular, for use cases with time series data, a collection alias, 
> and data partitioned into multiple small collections by timestamp, the gain 
> can be large.
> For further optimization, it may be helpful to design a DocSet with run 
> length encoding. Thanks [~mmokhtar] for the suggestion.
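
To make the idea concrete, here is an illustrative sketch (the class name is 
invented and this is not the attached patch) of a match-all DocSet that stores 
only the document count instead of one bit per document:

{code}
// Sketch only: a DocSet for the "matches every doc" case that keeps just the
// size, instead of BitDocSet's long[] with every bit set.
final class MatchAllDocSetSketch {
  private final int size; // number of docs in the index

  MatchAllDocSetSketch(int size) {
    this.size = size;
  }

  boolean exists(int doc) {
    return true; // every document matches by definition
  }

  int size() {
    return size;
  }

  long memSize() {
    // roughly an object header plus one int, vs. maxDoc/8 bytes for a bitset
    return 16;
  }
}
{code}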






[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881361#comment-15881361
 ] 

ASF subversion and git services commented on SOLR-9764:
---

Commit 05c17c9a516d8501b2dcce9b5910a3d0b5510bc4 in lucene-solr's branch 
refs/heads/master from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=05c17c9 ]

SOLR-9764: fix CHANGES entry


> Design a memory efficient DocSet if a query returns all docs
> 
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
>Assignee: Yonik Seeley
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR_9764_no_cloneMe.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch
>
>
> In some use cases, particularly with time series data, using a collection 
> alias and partitioning data into multiple small collections by 
> timestamp, a filter query can match all documents in a collection. Currently 
> BitDocSet is used, which contains a large array of long integers with every 
> bit set to 1. After querying, the resulting DocSet saved in the filter cache is 
> large and becomes one of the main memory consumers in these use cases.
> For example, suppose a Solr setup has 14 collections for data from the last 14 
> days, each collection with one day of data. A filter query for the last week of 
> data would result in at least six DocSets in the filter cache, each matching all 
> documents in one of six collections.
> This is to design a new DocSet that is memory efficient for such a use case. 
> The new DocSet removes the large array, reducing memory usage and GC pressure 
> without losing the advantage of a large filter cache.
> In particular, for use cases with time series data, a collection alias, 
> and data partitioned into multiple small collections by timestamp, the gain 
> can be large.
> For further optimization, it may be helpful to design a DocSet with run 
> length encoding. Thanks [~mmokhtar] for the suggestion.






[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2017-02-23 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881330#comment-15881330
 ] 

Yonik Seeley commented on SOLR-9764:


Hmmm, yep.  I'll fix...

> Design a memory efficient DocSet if a query returns all docs
> 
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
>Assignee: Yonik Seeley
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR_9764_no_cloneMe.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch
>
>
> In some use cases, particularly with time series data, using a collection 
> alias and partitioning data into multiple small collections by 
> timestamp, a filter query can match all documents in a collection. Currently 
> BitDocSet is used, which contains a large array of long integers with every 
> bit set to 1. After querying, the resulting DocSet saved in the filter cache is 
> large and becomes one of the main memory consumers in these use cases.
> For example, suppose a Solr setup has 14 collections for data from the last 14 
> days, each collection with one day of data. A filter query for the last week of 
> data would result in at least six DocSets in the filter cache, each matching all 
> documents in one of six collections.
> This is to design a new DocSet that is memory efficient for such a use case. 
> The new DocSet removes the large array, reducing memory usage and GC pressure 
> without losing the advantage of a large filter cache.
> In particular, for use cases with time series data, a collection alias, 
> and data partitioned into multiple small collections by timestamp, the gain 
> can be large.
> For further optimization, it may be helpful to design a DocSet with run 
> length encoding. Thanks [~mmokhtar] for the suggestion.






[JENKINS] Lucene-Solr-6.x-Linux (32bit/jdk1.8.0_121) - Build # 2927 - Unstable!

2017-02-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/2927/
Java: 32bit/jdk1.8.0_121 -server -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.update.TestInPlaceUpdatesDistrib.test

Error Message:
'sanitycheck' results against client: 
org.apache.solr.client.solrj.impl.HttpSolrClient@14e7321 (not leader) wrong 
[docid] for SolrDocument{id=180, 
id_field_copy_that_does_not_support_in_place_update_s=180, title_s=title180, 
id_i=180, inplace_updatable_float=101.0, _version_=1560158401798864896, 
inplace_updatable_int_with_default=666, 
inplace_updatable_float_with_default=42.0, [docid]=970} expected:<658> but 
was:<970>

Stack Trace:
java.lang.AssertionError: 'sanitycheck' results against client: 
org.apache.solr.client.solrj.impl.HttpSolrClient@14e7321 (not leader) wrong 
[docid] for SolrDocument{id=180, 
id_field_copy_that_does_not_support_in_place_update_s=180, title_s=title180, 
id_i=180, inplace_updatable_float=101.0, _version_=1560158401798864896, 
inplace_updatable_int_with_default=666, 
inplace_updatable_float_with_default=42.0, [docid]=970} expected:<658> but 
was:<970>
at 
__randomizedtesting.SeedInfo.seed([E6685467432DBE18:6E3C6BBDEDD1D3E0]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at 
org.apache.solr.update.TestInPlaceUpdatesDistrib.assertDocIdsAndValuesInResults(TestInPlaceUpdatesDistrib.java:442)
at 
org.apache.solr.update.TestInPlaceUpdatesDistrib.assertDocIdsAndValuesAgainstAllClients(TestInPlaceUpdatesDistrib.java:413)
at 
org.apache.solr.update.TestInPlaceUpdatesDistrib.docValuesUpdateTest(TestInPlaceUpdatesDistrib.java:321)
at 
org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:140)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 

[JENKINS] Lucene-Solr-Tests-6.x - Build # 749 - Unstable

2017-02-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-6.x/749/

1 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.security.hadoop.TestSolrCloudWithHadoopAuthPlugin

Error Message:
Address already in use

Stack Trace:
java.net.BindException: Address already in use
at __randomizedtesting.SeedInfo.seed([D107F67B9D7C17EA]:0)
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at 
org.apache.mina.transport.socket.nio.NioSocketAcceptor.open(NioSocketAcceptor.java:252)
at 
org.apache.mina.transport.socket.nio.NioSocketAcceptor.open(NioSocketAcceptor.java:49)
at 
org.apache.mina.core.polling.AbstractPollingIoAcceptor.registerHandles(AbstractPollingIoAcceptor.java:525)
at 
org.apache.mina.core.polling.AbstractPollingIoAcceptor.access$200(AbstractPollingIoAcceptor.java:67)
at 
org.apache.mina.core.polling.AbstractPollingIoAcceptor$Acceptor.run(AbstractPollingIoAcceptor.java:409)
at 
org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:65)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 11778 lines...]
   [junit4] Suite: 
org.apache.solr.security.hadoop.TestSolrCloudWithHadoopAuthPlugin
   [junit4]   2> Creating dataDir: 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-6.x/solr/build/solr-core/test/J0/temp/solr.security.hadoop.TestSolrCloudWithHadoopAuthPlugin_D107F67B9D7C17EA-001/init-core-data-001
   [junit4]   2> 651059 INFO  
(SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] 
o.a.s.SolrTestCaseJ4 Using TrieFields
   [junit4]   2> 651061 INFO  
(SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] 
o.a.s.SolrTestCaseJ4 Randomized ssl (false) and clientAuth (true) via: 
@org.apache.solr.util.RandomizeSSL(reason=, ssl=NaN, value=NaN, clientAuth=NaN)
   [junit4]   2> 655245 WARN  
(SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] 
o.a.d.s.c.DefaultDirectoryService You didn't change the admin password of 
directory service instance 'DefaultKrbServer'.  Please update the admin 
password as soon as possible to prevent a possible security breach.
   [junit4]   2> 655959 INFO  
(SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] 
o.a.s.SolrTestCaseJ4 ###deleteCore
   [junit4]   2> 655959 INFO  
(SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] 
o.a.s.SolrTestCaseJ4 --- 
Done waiting for all SolrIndexSearchers to be released
   [junit4]   2> 655959 INFO  
(SUITE-TestSolrCloudWithHadoopAuthPlugin-seed#[D107F67B9D7C17EA]-worker) [] 
o.a.s.SolrTestCaseJ4 --- 
Done waiting for tracked resources to be released
   [junit4]   2> NOTE: test params are: codec=Lucene62, 
sim=RandomSimilarity(queryNorm=false,coord=crazy): {}, locale=es-AR, 
timezone=Africa/Casablanca
   [junit4]   2> NOTE: Linux 3.13.0-85-generic amd64/Oracle Corporation 
1.8.0_121 (64-bit)/cpus=4,threads=1,free=215587256,total=531103744
   [junit4]   2> NOTE: All tests run in this JVM: [ScriptEngineTest, 
RuleEngineTest, DocumentAnalysisRequestHandlerTest, ClusterStateTest, 
XsltUpdateRequestHandlerTest, TestSolr4Spatial, 
TestSolrConfigHandlerConcurrent, UtilsToolTest, MergeStrategyTest, 
SubstringBytesRefFilterTest, AssignTest, TestRealTimeGet, 
TestHighlightDedupGrouping, AnalysisErrorHandlingTest, 
SignatureUpdateProcessorFactoryTest, SolrCoreMetricManagerTest, 
TestSolrConfigHandler, TestCollectionAPIs, TestFunctionQuery, 
TestBulkSchemaAPI, DistributedFacetPivotSmallTest, SolrCoreTest, 
TestSchemaSimilarityResource, TestRandomRequestDistribution, 
OverseerCollectionConfigSetProcessorTest, LeaderElectionIntegrationTest, 
TestPHPSerializedResponseWriter, TestCloudManagedSchema, TestSolrCoreParser, 
TestImplicitCoreProperties, TestExactSharedStatsCache, TestNumericTerms64, 
RecoveryAfterSoftCommitTest, TestFilteredDocIdSet, 
TestSchemalessBufferedUpdates, TestStressRecovery, TestSolrCoreProperties, 
TestSolrCloudSnapshots, HdfsCollectionsAPIDistributedZkTest, 
TestNonDefinedSimilarityFactory, AnalysisAfterCoreReloadTest, TestRTimerTree, 
OutputWriterTest, TestAddFieldRealTimeGet, TestCloudPivotFacet, 
TestRecoveryHdfs, TestQuerySenderNoQuery, TestConfigSetImmutable, 
BlockCacheTest, TriLevelCompositeIdRoutingTest, TestIBSimilarityFactory, 
SSLMigrationTest, TestCorePropertiesReload, SolrRequestParserTest, 
TestOmitPositions, ReturnFieldsTest, 

[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2017-02-23 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881245#comment-15881245
 ] 

Steve Rowe commented on SOLR-9764:
--

@yonik: The original commit on this issue included the following CHANGES entry:

bq. SOLR-9764: All filters that which all documents in the index now share the 
same memory (DocSet).

I think that the "which" in that sentence should instead be "match"?

> Design a memory efficient DocSet if a query returns all docs
> 
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
>Assignee: Yonik Seeley
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR_9764_no_cloneMe.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch
>
>
> In some use cases, particularly with time series data, using a collection 
> alias and partitioning data into multiple small collections by 
> timestamp, a filter query can match all documents in a collection. Currently 
> BitDocSet is used, which contains a large array of long integers with every 
> bit set to 1. After querying, the resulting DocSet saved in the filter cache is 
> large and becomes one of the main memory consumers in these use cases.
> For example, suppose a Solr setup has 14 collections for data from the last 14 
> days, each collection with one day of data. A filter query for the last week of 
> data would result in at least six DocSets in the filter cache, each matching all 
> documents in one of six collections.
> This is to design a new DocSet that is memory efficient for such a use case. 
> The new DocSet removes the large array, reducing memory usage and GC pressure 
> without losing the advantage of a large filter cache.
> In particular, for use cases with time series data, a collection alias, 
> and data partitioned into multiple small collections by timestamp, the gain 
> can be large.
> For further optimization, it may be helpful to design a DocSet with run 
> length encoding. Thanks [~mmokhtar] for the suggestion.






[jira] [Commented] (LUCENE-7705) Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length

2017-02-23 Thread Amrit Sarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881239#comment-15881239
 ] 

Amrit Sarkar commented on LUCENE-7705:
--

I have cooked up a patch in SOLR-10186 and introduced a new constructor in 
CharTokenizer and the related Tokenizer factories, which takes _maxCharLen_ and 
_factory_ as parameters.

Kindly provide your feedback and any comments on introducing new constructors 
in the classes. Thanks.
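
As a sketch of the factory-level counterpart (the argument key name is assumed 
from this discussion and only works with the patch applied; it is not 
necessarily the final name):

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizerFactory;

final class FactoryArgSketch {
  static Tokenizer create() {
    Map<String, String> args = new HashMap<>();
    args.put("maxTokenLen", "512"); // key name assumed from the discussion
    // Without the patch the factory would reject "maxTokenLen" as an
    // unknown parameter.
    return new WhitespaceTokenizerFactory(args).create();
  }
}
{code}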

> Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the 
> max token length
> -
>
> Key: LUCENE-7705
> URL: https://issues.apache.org/jira/browse/LUCENE-7705
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Amrit Sarkar
>Priority: Minor
>
> SOLR-10186
> [~erickerickson]: Is there a good reason that we hard-code a 256 character 
> limit for the CharTokenizer? Changing this limit requires people to copy/paste 
> incrementToken into some new class, since incrementToken 
> is final.
> KeywordTokenizer can easily change the default (which is also 256 bytes), but 
> to do so requires code rather than being able to configure it in the schema.
> For KeywordTokenizer, this is Solr-only. For the CharTokenizer classes 
> (WhitespaceTokenizer, UnicodeWhitespaceTokenizer and LetterTokenizer) 
> (Factories) it would take adding a c'tor to the base class in Lucene and 
> using it in the factory.
> Any objections?






[jira] [Commented] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881219#comment-15881219
 ] 

ASF subversion and git services commented on SOLR-10156:


Commit 66fb1f83d64f5c79cedd4876e19a541eba30aed1 in lucene-solr's branch 
refs/heads/branch_6x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=66fb1f8 ]

SOLR-10156: Increase the overfetch


> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection, 
>q="abc", 
>field="some_text_field", 
>minDocFreq="x", 
>maxDocFreq="y",
>limit="50")
> {code}
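
As a usage sketch (assumptions: the expression is sent to the standard /stream 
handler, the base URL and collection name are placeholders, and the printed 
tuple fields depend on the expression's output format):

{code}
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.common.params.ModifiableSolrParams;

public class SignificantTermsSketch {
  public static void main(String[] args) throws Exception {
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("expr", "significantTerms(collection, q=\"abc\", "
        + "field=\"some_text_field\", minDocFreq=\"5\", maxDocFreq=\"0.3\", limit=\"50\")");
    params.set("qt", "/stream");
    SolrStream stream = new SolrStream("http://localhost:8983/solr/collection", params);
    try {
      stream.open();
      for (Tuple t = stream.read(); !t.EOF; t = stream.read()) {
        System.out.println(t.fields); // print each emitted tuple as-is
      }
    } finally {
      stream.close();
    }
  }
}
{code}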






[jira] [Commented] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881218#comment-15881218
 ] 

ASF subversion and git services commented on SOLR-10156:


Commit 744fbde1b6d770caafe0d0a4507fea30d08f8152 in lucene-solr's branch 
refs/heads/branch_6x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=744fbde ]

SOLR-10156: Add significantTerms Streaming Expression


> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection, 
>q="abc", 
>field="some_text_field", 
>minDocFreq="x", 
>maxDocFreq="y",
>limit="50")
> {code}






[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-7707:

Attachment: LUCENE-7707.patch

fix typo s/loosing/losing

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, 
> LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged 
> at once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not already been set to a non-default 
> value (i.e. it is still -1), to allow multiple or sparse top docs merging.






[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881181#comment-15881181
 ] 

Michael McCandless commented on LUCENE-7707:


+1, thanks [~simonw].

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, 
> LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged 
> at once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not already been set to a non-default 
> value (i.e. it is still -1), to allow multiple or sparse top docs merging.






[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-7707:

Attachment: LUCENE-7707.patch

updated javadocs

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, 
> LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged 
> at once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not already been set to a non-default 
> value (i.e. it is still -1), to allow multiple or sparse top docs merging.






[jira] [Created] (SOLR-10198) EmbeddedSolrServer embedded behavior different from HttpSolrClient

2017-02-23 Thread Bert Summers (JIRA)
Bert Summers created SOLR-10198:
---

 Summary: EmbeddedSolrServer embedded behavior different from 
HttpSolrClient
 Key: SOLR-10198
 URL: https://issues.apache.org/jira/browse/SOLR-10198
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Affects Versions: 6.4.1
Reporter: Bert Summers


When retrieving the value of a field the object type is different depending on 
the server type.

If I have a schema which has 

If I do
solrClient.queryAndStreamResponse("test", new SolrQuery("*:*"),
    new StreamingResponseCallback() {

      @Override
      public void streamSolrDocument(final SolrDocument doc) {
        Object idField = doc.getFieldValue("id");
      }

      @Override
      public void streamDocListInfo(final long numFound, final long start,
          final Float maxScore) {
        System.out.println("Found " + numFound + " documents");
      }
    });


In streamSolrDocument the object type is Integer if the server is HTTP, but 
StoredField if embedded.

Both the server and the embedded instance use the same schema.xml and solrconfig.xml.

In version 5.1.0 both connections would return the same type (Integer).






[jira] [Commented] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881145#comment-15881145
 ] 

ASF subversion and git services commented on SOLR-10156:


Commit a0aef2faaf7da56efc8ac4b004e9d3b8dc401e81 in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a0aef2f ]

SOLR-10156: Increase the overfetch


> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection, 
>q="abc", 
>field="some_text_field", 
>minDocFreq="x", 
>maxDocFreq="y",
>limit="50")
> {code}






[jira] [Commented] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881144#comment-15881144
 ] 

ASF subversion and git services commented on SOLR-10156:


Commit dba733e7aa90bd607fdda0342b94bc17bb717c31 in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=dba733e ]

SOLR-10156: Add significantTerms Streaming Expression


> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection, 
>q="abc", 
>field="some_text_field", 
>minDocFreq="x", 
>maxDocFreq="y",
>limit="50")
> {code}






[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881132#comment-15881132
 ] 

Michael McCandless commented on LUCENE-7707:


+1, looks awesome!

Maybe update the javadocs to explain that we will either fill in the shardIndex 
or will not but never both.

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, 
> LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged 
> at once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not already been set to a non-default 
> value (i.e. it is still -1), to allow multiple or sparse top docs merging.






[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-7707:

Attachment: LUCENE-7707.patch

s/where/were

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch, 
> LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged 
> at once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not already been set to a non-default 
> value (i.e. it is still -1), to allow multiple or sparse top docs merging.






[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-7707:

Attachment: LUCENE-7707.patch

here is a new patch adding more safety and making the decision up-front 
whether we assign shardIndex or not
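
A minimal sketch of that up-front decision (simplified for illustration, not 
copied from the patch):

{code}
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

final class ShardIndexSketch {
  // Decide once, before merging, whether merge should assign shardIndex:
  // only when every incoming hit still carries the default -1. Mixing
  // pre-set and unset values is rejected.
  static void assignShardIndices(TopDocs[] shardHits) {
    boolean setShardIndex = true;
    for (TopDocs shard : shardHits) {
      for (ScoreDoc d : shard.scoreDocs) {
        if (d.shardIndex != -1) {
          setShardIndex = false; // the caller already assigned shard indices
        }
      }
    }
    for (int shardIDX = 0; shardIDX < shardHits.length; shardIDX++) {
      for (ScoreDoc d : shardHits[shardIDX].scoreDocs) {
        if (setShardIndex) {
          d.shardIndex = shardIDX;
        } else if (d.shardIndex == -1) {
          throw new IllegalArgumentException(
              "either all shardIndex values must be set, or none");
        }
      }
    }
  }
}
{code}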

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged 
> at once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not already been set to a non-default 
> value (i.e. it is still -1), to allow multiple or sparse top docs merging.






[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881076#comment-15881076
 ] 

Simon Willnauer commented on LUCENE-7707:
-

bq. Maybe we could require that either all incoming shardIndex are undefined, 
or all are set, but you are not allowed to mix?

I think this is what we should ultimately do. I don't see a different way than 
peeking at the TopDocs to see if it's preset and then executing based on 
that. I can certainly add assertions...

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged 
> at once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not already been set to a non-default 
> value (i.e. it is still -1), to allow multiple or sparse top docs merging.






[jira] [Commented] (SOLR-9887) Add KeepWordFilter, StemmerOverrideFilter, StopFilterFactory, SynonymFilter that reads data from a JDBC source

2017-02-23 Thread Christine Poerschke (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881013#comment-15881013
 ] 

Christine Poerschke commented on SOLR-9887:
---

(Late to the party here.)

I think support for stop words, synonyms, etc. from sources other than text 
files would be a useful feature for Solr and using streaming expressions to 
'abstract away' the source of the stop words sounds like a good generalisation.

What might a preferred and suitable approach be to take this forward? In no 
particular order:

* Option 1: config-to-code
** starting with the existing config e.g. {{}} 
work out and sketch out what the new streaming expressions based configuration 
will look like
** coding up of that solution

* Option 2: build-upon-existing
** creation of a pull request against lucene-solr based upon 
https://github.com/shopping24/solr-jdbc as per above
** transformation of that pull request into streaming expressions based approach

* Option 3: 
** 

(From my very positive and collaborative experiences on SOLR-5730 and SOLR-8621 
my preference/recommendation would probably be 'Option 1' rather than 'Option 
2' and I'd be very interested to hear what Option 3, 4, etc. might be also.)

> Add KeepWordFilter, StemmerOverrideFilter, StopFilterFactory, SynonymFilter 
> that reads data from a JDBC source
> --
>
> Key: SOLR-9887
> URL: https://issues.apache.org/jira/browse/SOLR-9887
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tobias Kässmann
>Priority: Minor
>
> We've created some new {{FilterFactories}} that read their stopwords or 
> synonyms from a database (via a JDBC source). That enables easy 
> management of large lists and also adds the possibility to maintain them in 
> other tools. JDBC data sources are retrieved via JNDI.
> For easy reloading of these lists we've added a {{SearcherAwareReloader}} 
> abstraction that reloads the lists on every new searcher event.
> If this is a feature that is interesting for Solr, we will create a pull 
> request. All the sources are currently available here: 
> https://github.com/shopping24/solr-jdbc
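
For illustration, a minimal sketch of the pattern described (loading a stop 
word set from a JDBC source looked up via JNDI); the JNDI name, table, and 
column are assumptions for the example, not the solr-jdbc project's actual API:

{code}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.naming.InitialContext;
import javax.sql.DataSource;
import org.apache.lucene.analysis.CharArraySet;

final class JdbcStopwordsSketch {
  // Load stop words from a database table; all names below are illustrative.
  static CharArraySet loadStopwords() throws Exception {
    DataSource ds = (DataSource) new InitialContext()
        .lookup("java:comp/env/jdbc/stopwordsDS"); // assumed JNDI name
    CharArraySet stopSet = new CharArraySet(64, /* ignoreCase */ true);
    try (Connection conn = ds.getConnection();
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT word FROM stopwords")) {
      while (rs.next()) {
        stopSet.add(rs.getString(1));
      }
    }
    return stopSet;
  }
}
{code}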






[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881006#comment-15881006
 ] 

Ishan Chattopadhyaya commented on SOLR-10130:
-

Adding a link to https://issues.apache.org/jira/browse/SOLR-10182 for backing 
out the changes that caused these perf degradations.

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
>Reporter: Ere Maijala
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: perfomance
> Fix For: master (7.0), 6.4.2
>
> Attachments: SOLR-10130.patch, SOLR-10130.patch, 
> solr-8983-console-f1.log
>
>
> We've stumbled on serious performance issues after upgrading to Solr 6.4.1. 
> Looks like the new metrics collection system in MetricsDirectoryFactory is 
> causing a major slowdown. This happens with an index configuration that, as 
> far as I can see, has no metrics specific configuration and uses 
> luceneMatchVersion 5.5.0. In practice a moderate load will completely bog 
> down the server, with Solr threads constantly using up all CPU capacity 
> (600% on a 6-core machine) under a load where we normally see an average 
> load of < 50%.
> I took stack traces (I'll attach them) and noticed that the threads are 
> spending time in com.codahale.metrics.Meter.mark. I tested building Solr 
> 6.4.1 with the metrics collection disabled in MetricsDirectoryFactory getByte 
> and getBytes methods and was unable to reproduce the issue.
> As far as I can see there are several issues:
> 1. Collecting metrics on every single byte read is slow.
> 2. Having it enabled by default is not a good idea.
> 3. The comment "enable coarse-grained metrics by default" at 
> https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L104
>  implies that only coarse-grained metrics should be enabled by default, and 
> this contradicts collecting metrics on every single byte read.
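
To illustrate the reported pattern (a sketch of the overhead shape only, not 
the actual MetricsDirectoryFactory code): marking a Meter on every byte puts a 
metered call on the hottest I/O path, so per-byte reads become dominated by 
metrics bookkeeping.

{code}
import java.io.IOException;
import com.codahale.metrics.Meter;
import org.apache.lucene.store.IndexInput;

// Sketch only: a wrapper that meters every single byte read, the pattern the
// stack traces point at (com.codahale.metrics.Meter.mark on the per-byte path).
final class MeteredReadSketch {
  private final IndexInput in;
  private final Meter bytesRead;

  MeteredReadSketch(IndexInput in, Meter bytesRead) {
    this.in = in;
    this.bytesRead = bytesRead;
  }

  byte readByte() throws IOException {
    bytesRead.mark(); // metered on EVERY byte: the reported hot spot
    return in.readByte();
  }
}
{code}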






[jira] [Commented] (SOLR-10155) Clarify logic for term filters on numeric types

2017-02-23 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880996#comment-15880996
 ] 

Adrien Grand commented on SOLR-10155:
-

+1 to explicitly rejecting facet.contains and facet.prefix on numerics with a 
clear error message.

> Clarify logic for term filters on numeric types
> ---
>
> Key: SOLR-10155
> URL: https://issues.apache.org/jira/browse/SOLR-10155
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: faceting
>Affects Versions: 6.4.1
>Reporter: Gus Heck
>Assignee: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-10155.patch
>
>
> The following code has been found to be confusing to multiple folks working 
> in SimpleFacets.java (see SOLR-10132)
> {code}
> if (termFilter != null) {
>   // TODO: understand this logic... what is the case for 
> supporting an empty string
>   // for contains on numeric facets? What does that achieve?
>   // The exception message is misleading in the case of an 
> excludeTerms filter in any case...
>   // Also maybe vulnerable to NPE on isEmpty test?
>   final boolean supportedOperation = (termFilter instanceof 
> SubstringBytesRefFilter) && ((SubstringBytesRefFilter) 
> termFilter).substring().isEmpty();
>   if (!supportedOperation) {
> throw new SolrException(ErrorCode.BAD_REQUEST, 
> FacetParams.FACET_CONTAINS + " is not supported on numeric types");
>   }
> }
> {code}
> This is found around line 482 or so. The comment in the code above is mine, 
> and won't be found in the codebase. This ticket can be resolved by 
> eliminating the complex check and just denying all termFilters with a better 
> exception message not specific to contains filters (and perhaps consolidated 
> with the preceding check for prefix filters?), or adding a comment to 
> the code base explaining why we need to allow a term filter with an empty, 
> non-null string to be processed, and why this isn't an NPE waiting to happen.
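
For illustration, one possible shape of the blanket rejection suggested above 
(a sketch, not the committed fix):

{code}
// Hypothetical simplification: deny every term filter on numeric facets,
// with a message that is no longer specific to facet.contains.
if (termFilter != null) {
  throw new SolrException(ErrorCode.BAD_REQUEST,
      "term filters (facet.contains, facet.excludeTerms) are not supported on numeric types");
}
{code}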



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-10173) Enable extension/customization of HttpShardHandler by increasing visibility

2017-02-23 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke resolved SOLR-10173.

   Resolution: Fixed
Fix Version/s: master (7.0)
   6.x

Thanks Ramsey!

> Enable extension/customization of HttpShardHandler by increasing visibility
> ---
>
> Key: SOLR-10173
> URL: https://issues.apache.org/jira/browse/SOLR-10173
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ramsey Haddad
>Assignee: Christine Poerschke
>Priority: Minor
> Fix For: 6.x, master (7.0)
>
> Attachments: solr-10173.patch, SOLR-10173.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Increase visibility of 2 elements of HttpShardHandlerFactory from "private" 
> to "protected" to facilitate extension of the class. Make 
> ReplicaListTransformer "public" to enable implementation of the interface in 
> custom classes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10155) Clarify logic for term filters on numeric types

2017-02-23 Thread Christine Poerschke (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880988#comment-15880988
 ] 

Christine Poerschke commented on SOLR-10155:


bq. ... whether there's a use case for passing blanks through ... supplying a 
blank is the means of "turning it off" without blowing up ...

That's a fair point, yes; the change in behaviour would have to be documented 
clearly in CHANGES.txt, e.g. something along the lines of _"facet.contains= 
is now rejected for numeric types"_.

So then, yes, would it make sense to apply the same change to facet.prefix with 
a joint _"facet.contains= and facet.prefix= are now rejected for numeric 
types"_ CHANGES.txt note?

[~jpountz] - would you have any thoughts on this, following on from the (long 
time ago) SOLR-3855 commit Gus mentioned above?

> Clarify logic for term filters on numeric types
> ---
>
> Key: SOLR-10155
> URL: https://issues.apache.org/jira/browse/SOLR-10155
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: faceting
>Affects Versions: 6.4.1
>Reporter: Gus Heck
>Priority: Minor
> Attachments: SOLR-10155.patch
>
>
> The following code has been found to be confusing to multiple folks working 
> in SimpleFacets.java (see SOLR-10132)
> {code}
> if (termFilter != null) {
>   // TODO: understand this logic... what is the case for 
> supporting an empty string
>   // for contains on numeric facets? What does that achieve?
>   // The exception message is misleading in the case of an 
> excludeTerms filter in any case...
>   // Also maybe vulnerable to NPE on isEmpty test?
>   final boolean supportedOperation = (termFilter instanceof 
> SubstringBytesRefFilter) && ((SubstringBytesRefFilter) 
> termFilter).substring().isEmpty();
>   if (!supportedOperation) {
> throw new SolrException(ErrorCode.BAD_REQUEST, 
> FacetParams.FACET_CONTAINS + " is not supported on numeric types");
>   }
> }
> {code}
> This is found around line 482 or so. The comment in the code above is mine, 
> and won't be found in the codebase. This ticket can be resolved by 
> eliminating the complex check and just denying all termFilters with a better 
> exception message not specific to contains filters (and perhaps consolidated 
> with the preceding check for prefix filters?), or adding a comment to 
> the code base explaining why we need to allow a term filter with an empty, 
> non-null string to be processed, and why this isn't an NPE waiting to happen.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-10155) Clarify logic for term filters on numeric types

2017-02-23 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke reassigned SOLR-10155:
--

Assignee: Christine Poerschke

> Clarify logic for term filters on numeric types
> ---
>
> Key: SOLR-10155
> URL: https://issues.apache.org/jira/browse/SOLR-10155
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: faceting
>Affects Versions: 6.4.1
>Reporter: Gus Heck
>Assignee: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-10155.patch
>
>
> The following code has been found to be confusing to multiple folks working 
> in SimpleFacets.java (see SOLR-10132)
> {code}
> if (termFilter != null) {
>   // TODO: understand this logic... what is the case for 
> supporting an empty string
>   // for contains on numeric facets? What does that achieve?
>   // The exception message is misleading in the case of an 
> excludeTerms filter in any case...
>   // Also maybe vulnerable to NPE on isEmpty test?
>   final boolean supportedOperation = (termFilter instanceof 
> SubstringBytesRefFilter) && ((SubstringBytesRefFilter) 
> termFilter).substring().isEmpty();
>   if (!supportedOperation) {
> throw new SolrException(ErrorCode.BAD_REQUEST, 
> FacetParams.FACET_CONTAINS + " is not supported on numeric types");
>   }
> }
> {code}
> This is found around line 482 or so. The comment in the code above is mine, 
> and won't be found in the codebase. This ticket can be resolved by 
> eliminating the complex check and just denying all termFilters with a better 
> exception message not specific to contains filters (and perhaps consolidated 
> with the preceding check for prefix filters?), or adding a comment to 
> the code base explaining why we need to allow a term filter with an empty, 
> non-null string to be processed, and why this isn't an NPE waiting to happen.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880862#comment-15880862
 ] 

Michael McCandless commented on LUCENE-7707:


Maybe we could require that either all incoming {{shardIndex}} values are 
undefined, or all are set, but you are not allowed to mix?
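
For illustration, a minimal sketch of such an all-or-nothing check 
(hypothetical, not part of the patch):

{code}
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

// Hypothetical up-front validation: either every incoming hit still has the
// unset default (-1) or every hit has shardIndex set; mixing the two fails.
final class ShardIndexConsistencyCheck {
  static void check(TopDocs[] shardHits) {
    boolean sawSet = false, sawUnset = false;
    for (TopDocs hits : shardHits) {
      for (ScoreDoc doc : hits.scoreDocs) {
        if (doc.shardIndex == -1) sawUnset = true; else sawSet = true;
      }
    }
    if (sawSet && sawUnset) {
      throw new IllegalArgumentException(
          "either all hits must have shardIndex set, or none of them");
    }
  }
}
{code}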

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at 
> once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not been assigned before to a non-default 
> (-1) value, allowing multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Jim Ferenczi (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880859#comment-15880859
 ] 

Jim Ferenczi commented on LUCENE-7707:
--

bq. Plus I think it's very unlikely someone today is pre-setting the shardIndex 
(off of its default -1 value) and then relying on TopDocs.merge

Good point. +1 to the patch too, there's nothing to break here ;)


> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at 
> once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not been assigned before to a non-default 
> (-1) value, allowing multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880854#comment-15880854
 ] 

Adrien Grand commented on LUCENE-7707:
--

I don't like the fact that if you mix top docs that have the shard index set 
and other instances that have it undefined, then we could end up assigning a 
shard id that is already in use. Is there a way we can avoid that?

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at 
> once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not been assigned before to a non-default 
> (-1) value, allowing multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-10156) Add significantTerms Streaming Expression

2017-02-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-10156:
--
Attachment: SOLR-10156.patch

> Add significantTerms Streaming Expression
> -
>
> Key: SOLR-10156
> URL: https://issues.apache.org/jira/browse/SOLR-10156
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 6.5
>
> Attachments: SOLR-10156.patch, SOLR-10156.patch, SOLR-10156.patch
>
>
> The significantTerms Streaming Expression will emit a set of terms from a 
> *text field* within a doc frequency range for a specific query. It will also 
> score the terms based on how many times the terms appear in the result set, 
> and how many times the terms appear in the corpus, and return the top N terms 
> based on this significance score.
> Syntax:
> {code}
> significantTerms(collection, 
>q="abc", 
>field="some_text_field", 
>minDocFreq="x", 
>maxDocFreq="y",
>limit="50")
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880838#comment-15880838
 ] 

Michael McCandless commented on LUCENE-7707:


+1 to the patch.

bq. I personally would like that but Michael McCandless had some issues with 
this?

Yeah, I'd prefer not to add a boolean argument: that's allowing a temporary 
back compat issue to have a permanent impact on our APIs.  Our APIs should be 
designed for future usage.  Plus I think it's very unlikely someone today is 
pre-setting the shardIndex (off of its default -1 value) and then relying on 
TopDocs.merge to overwrite it.  I think the patch is sufficient back compat 
behavior w/o a compromised API change.
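
A minimal sketch of the conditional assignment the patch is about 
(illustrative only, not the actual committed code):

{code}
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

// Hypothetical sketch: shardIndex is only stamped while it is still at its
// -1 default; a caller's pre-set value is left untouched.
final class ConditionalShardIndex {
  static void assign(TopDocs[] shardHits) {
    for (int shard = 0; shard < shardHits.length; shard++) {
      for (ScoreDoc doc : shardHits[shard].scoreDocs) {
        if (doc.shardIndex == -1) { // unset default
          doc.shardIndex = shard;   // position in the outer TopDocs array
        }
      }
    }
  }
}
{code}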

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at 
> once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not been assigned before to a non-default 
> (-1) value, allowing multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-master - Build # 1690 - Unstable

2017-02-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/1690/

3 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.core.TestLazyCores

Error Message:
ObjectTracker found 5 object(s) that were not released!!! [MMapDirectory, 
MMapDirectory, SolrCore, MMapDirectory, MDCAwareThreadPoolExecutor] 
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MMapDirectory  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:347)
  at 
org.apache.solr.core.MetricsDirectoryFactory.get(MetricsDirectoryFactory.java:208)
  at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:348)  at 
org.apache.solr.core.SolrCore.initIndex(SolrCore.java:694)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:911)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:828)  at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:937)  at 
org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:572)  at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
 at java.lang.Thread.run(Thread.java:745)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MMapDirectory  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:347)
  at 
org.apache.solr.core.MetricsDirectoryFactory.get(MetricsDirectoryFactory.java:208)
  at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:98)  at 
org.apache.solr.core.SolrCore.initIndex(SolrCore.java:726)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:911)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:828)  at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:937)  at 
org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:572)  at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
 at java.lang.Thread.run(Thread.java:745)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.solr.core.SolrCore  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at org.apache.solr.core.SolrCore.(SolrCore.java:1001)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:828)  at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:937)  at 
org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:572)  at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
 at java.lang.Thread.run(Thread.java:745)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MMapDirectory  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:347)
  at 
org.apache.solr.core.MetricsDirectoryFactory.get(MetricsDirectoryFactory.java:208)
  at 
org.apache.solr.core.SolrCore.initSnapshotMetaDataManager(SolrCore.java:479)  
at org.apache.solr.core.SolrCore.(SolrCore.java:905)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:828)  at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:937)  at 
org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:572)  at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
 at 

[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880794#comment-15880794
 ] 

Uwe Schindler commented on SOLR-9450:
-

Actually it's much easier:
{noformat}
solr.javadoc.url=${JOB_URL}javadoc/
{noformat}

> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880593#comment-15880593
 ] 

Uwe Schindler edited comment on SOLR-9450 at 2/23/17 4:25 PM:
--

I updated the Jenkins Jobs for artifacts and added:

{noformat}
solr.javadoc.url=${JENKINS_URL}job/${JOB_NAME}/javadoc/
{noformat}


was (Author: thetaphi):
I updated the Jenkins Jobs for artifacts and added:

{noformat}
solr.javadoc.url=${JENKINS_URL}/job/${JOB_NAME}/javadoc/
{noformat}

> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880767#comment-15880767
 ] 

Simon Willnauer commented on LUCENE-7707:
-

I personally think making this solely dependent on a boolean would be best. It 
would be an additional overload of the methods that explicitly turns on 
setting shardIndex on the ScoreDoc, and we wouldn't need as many conditionals 
in the tie-breaking. I personally would like that, but [~mikemccand] had some 
issues with this?

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at 
> once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not been assigned before to a non-default 
> (-1) value, allowing multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Jim Ferenczi (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880708#comment-15880708
 ] 

Jim Ferenczi commented on LUCENE-7707:
--

+1, this will make the merge more flexible.
If we really want to be sure that it does not break BWC, maybe it can be an 
option of the merge function? A simple boolean overrideShardIndex with a 
default value of false?

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at 
> once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not been assigned before to a non-default 
> (-1) value, allowing multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-7707:

Attachment: LUCENE-7707.patch

Here is another iteration that makes sorting stable and shares some 
tie-breaking code between the sort-based and score-based merge paths.
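
For illustration, a hedged sketch of a stable tie-break of this kind 
(hypothetical, not the patch itself):

{code}
import org.apache.lucene.search.ScoreDoc;

// Hypothetical tie-break shared by the sort-based and score-based merge
// paths: when two hits compare equal, prefer the lower shard index, then
// the lower doc id, which keeps the merge order stable.
final class TieBreak {
  static boolean lessThan(ScoreDoc a, ScoreDoc b) {
    if (a.shardIndex != b.shardIndex) {
      return a.shardIndex < b.shardIndex;
    }
    return a.doc < b.doc;
  }
}
{code}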

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch, LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at 
> once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not been assigned before to a non-default 
> (-1) value, allowing multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_121) - Build # 19034 - Unstable!

2017-02-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/19034/
Java: 64bit/jdk1.8.0_121 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  
org.apache.solr.security.hadoop.TestDelegationWithHadoopAuth.testDelegationTokenRenew

Error Message:
expected:<200> but was:<403>

Stack Trace:
java.lang.AssertionError: expected:<200> but was:<403>
at 
__randomizedtesting.SeedInfo.seed([CB220527BCDC9F4B:FCB9F139841042EF]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.security.hadoop.TestDelegationWithHadoopAuth.renewDelegationToken(TestDelegationWithHadoopAuth.java:118)
at 
org.apache.solr.security.hadoop.TestDelegationWithHadoopAuth.verifyDelegationTokenRenew(TestDelegationWithHadoopAuth.java:302)
at 
org.apache.solr.security.hadoop.TestDelegationWithHadoopAuth.testDelegationTokenRenew(TestDelegationWithHadoopAuth.java:319)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 

[jira] [Commented] (SOLR-10092) HDFS: AutoAddReplica fails

2017-02-23 Thread Hendrik Haddorp (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880651#comment-15880651
 ] 

Hendrik Haddorp commented on SOLR-10092:


Sorry for the spam; it looks like I tested my patch incorrectly last time. Solr 
6.3 on HDFS with legacyMode=false fails with the stated exception, but just 
applying my patch does not fix that. The exception is gone, but then I get:
org.apache.solr.common.SolrException: coreNodeName core_node1 exists, but does 
not match expected node or core name: 
DocCollection(test.test3//collections/test.test3/state.json/50)={

}
at 
org.apache.solr.cloud.ZkController.checkStateInZk(ZkController.java:1562)
at 
org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1488)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:837)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:779)

> HDFS: AutoAddReplica fails
> --
>
> Key: SOLR-10092
> URL: https://issues.apache.org/jira/browse/SOLR-10092
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: hdfs
>Affects Versions: 6.3
>Reporter: Hendrik Haddorp
> Attachments: SOLR-10092.patch
>
>
> OverseerAutoReplicaFailoverThread fails to create replacement core with this 
> exception:
> o.a.s.c.OverseerAutoReplicaFailoverThread Exception trying to create new 
> replica on 
> http://...:9000/solr:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>  Error from server at http://...:9000/solr: Error CREATEing SolrCore 
> 'test2.collection-09_shard1_replica1': Unable to create core 
> [test2.collection-09_shard1_replica1] Caused by: No shard id for 
> CoreDescriptor[name=test2.collection-09_shard1_replica1;instanceDir=/var/opt/solr/test2.collection-09_shard1_replica1]
> at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:593)
> at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262)
> at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:251)
> at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
> at 
> org.apache.solr.cloud.OverseerAutoReplicaFailoverThread.createSolrCore(OverseerAutoReplicaFailoverThread.java:456)
> at 
> org.apache.solr.cloud.OverseerAutoReplicaFailoverThread.lambda$addReplica$0(OverseerAutoReplicaFailoverThread.java:251)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745) 
> also see this mail thread about the issue: 
> https://lists.apache.org/thread.html/%3CCAA70BoWyzbvQuJTyzaG4Kx1tj0Djgcm+MV=x_hoac1e6cse...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10079) TestInPlaceUpdatesDistrib failure

2017-02-23 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880602#comment-15880602
 ] 

Steve Rowe commented on SOLR-10079:
---

My Jenkins found a reproducing branch_6x seed:

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestInPlaceUpdatesDistrib -Dtests.method=test 
-Dtests.seed=B88AA94CC5E07DDA -Dtests.slow=true -Dtests.locale=en-GB 
-Dtests.timezone=Africa/Tripoli -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 40.7s J1  | TestInPlaceUpdatesDistrib.test <<<
   [junit4]> Throwable #1: java.lang.AssertionError: 'sanitycheck' results 
against client: org.apache.solr.client.solrj.impl.HttpSolrClient@3cc2aada (not 
leader) wrong [docid] for SolrDocument{id=10, 
id_field_copy_that_does_not_support_in_place_update_s=10, title_s=title10, 
id_i=10, inplace_updatable_float=101.0, _version_=1560081900526108672, 
inplace_updatable_int_with_default=666, 
inplace_updatable_float_with_default=42.0, [docid]=322} expected:<306> but 
was:<322>
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([B88AA94CC5E07DDA:30DE96966B1C1022]:0)
   [junit4]>at 
org.apache.solr.update.TestInPlaceUpdatesDistrib.assertDocIdsAndValuesInResults(TestInPlaceUpdatesDistrib.java:442)
   [junit4]>at 
org.apache.solr.update.TestInPlaceUpdatesDistrib.assertDocIdsAndValuesAgainstAllClients(TestInPlaceUpdatesDistrib.java:413)
   [junit4]>at 
org.apache.solr.update.TestInPlaceUpdatesDistrib.docValuesUpdateTest(TestInPlaceUpdatesDistrib.java:321)
   [junit4]>at 
org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:140)
   [junit4]>at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992)
   [junit4]>at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967)
[...]
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene62): 
{title_s=PostingsFormat(name=Memory doPackFST= false), 
id=PostingsFormat(name=Direct), 
id_field_copy_that_does_not_support_in_place_update_s=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128)))},
 docValues:{inplace_updatable_float=DocValuesFormat(name=Direct), 
id_i=DocValuesFormat(name=Memory), _version_=DocValuesFormat(name=Lucene54), 
title_s=DocValuesFormat(name=Direct), id=DocValuesFormat(name=Lucene54), 
id_field_copy_that_does_not_support_in_place_update_s=DocValuesFormat(name=Lucene54),
 inplace_updatable_int_with_default=DocValuesFormat(name=Direct), 
inplace_updatable_float_with_default=DocValuesFormat(name=Memory)}, 
maxPointsInLeafNode=880, maxMBSortInHeap=5.97801431291, 
sim=RandomSimilarity(queryNorm=true,coord=no): {}, locale=en-GB, 
timezone=Africa/Tripoli
   [junit4]   2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 
1.8.0_77 (64-bit)/cpus=16,threads=1,free=210993488,total=526909440
{noformat}

> TestInPlaceUpdatesDistrib failure
> -
>
> Key: SOLR-10079
> URL: https://issues.apache.org/jira/browse/SOLR-10079
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Ishan Chattopadhyaya
> Attachments: SOLR-10079.patch, stdout
>
>
> From [https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18881/], 
> reproduces for me:
> {noformat}
> Checking out Revision d8d61ff61d1d798f5e3853ef66bc485d0d403f18 
> (refs/remotes/origin/master)
> [...]
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestInPlaceUpdatesDistrib -Dtests.method=test 
> -Dtests.seed=E1BB56269B8215B0 -Dtests.multiplier=3 -Dtests.slow=true 
> -Dtests.locale=sr-Latn-RS -Dtests.timezone=America/Grand_Turk 
> -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>[junit4] FAILURE 77.7s J2 | TestInPlaceUpdatesDistrib.test <<<
>[junit4]> Throwable #1: java.lang.AssertionError: Earlier: [79, 79, 
> 79], now: [78, 78, 78]
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([E1BB56269B8215B0:69EF69FC357E7848]:0)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.ensureRtgWorksWithPartialUpdatesTest(TestInPlaceUpdatesDistrib.java:425)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:142)
>[junit4]>  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>

[jira] [Closed] (SOLR-7764) Solr indexing hangs if encounters an certain XML parse error

2017-02-23 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson closed SOLR-7764.

Resolution: Invalid

This is a Tika issue, not a Solr one; please continue the discussion with the 
Tika project.

> Solr indexing hangs if encounters an certain XML parse error
> 
>
> Key: SOLR-7764
> URL: https://issues.apache.org/jira/browse/SOLR-7764
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.2
> Environment: Ubuntu 12.04.5 LTS
>Reporter: Sorin Gheorghiu
>  Labels: bluespice, indexing
> Attachments: Solr_XML_parse_error_080715.txt
>
>
> BlueSpice (http://bluespice.com/) uses Solr to index documents for the 
> 'Extended search' feature.
> Solr hangs if a certain error occurs during indexing:
> 8.7.2015 15:34:26
> ERROR
> SolrCore
> org.apache.solr.common.SolrException: 
> org.apache.tika.exception.TikaException: XML parse error
> 8.7.2015 15:34:26
> ERROR
> SolrDispatchFilter
> null:org.apache.solr.common.SolrException: 
> org.apache.tika.exception.TikaException: XML parse error



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880593#comment-15880593
 ] 

Uwe Schindler commented on SOLR-9450:
-

I updated the Jenkins Jobs for artifacts and added:

{noformat}
solr.javadoc.url=${JENKINS_URL}/job/${JOB_NAME}/javadoc/
{noformat}

> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10197) SolrException during indexing

2017-02-23 Thread Alexandre Rafalovitch (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880590#comment-15880590
 ] 

Alexandre Rafalovitch commented on SOLR-10197:
--

Please note that these kinds of questions are best asked on the Solr Users 
mailing list, as they usually indicate a configuration issue. JIRA is used to 
report errors against Lucene/Solr itself. Also, it is important to provide a 
Solr product version.

However, from the quick look at the exception, it seems that you have a 
MoreLikeThis component activated that has a numeric field configured as part of 
its similarity field list. When the search term is textual (and not numeric) 
and Solr tries to expand the query against the numeric field, this causes an 
exception. I would check the specific query issued to Solr, look at the 
definition of the request handler it is issued against (in solrconfig.xml) and 
check the MLT field list configuration and the types of those fields.
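
For illustration, a hypothetical solrconfig.xml fragment that could produce 
this situation (handler and field names are made up):

{code}
<!-- A numeric field (id_i) in the MLT similarity field list can trigger the
     exception when the query term is textual. -->
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
  <lst name="defaults">
    <str name="mlt.fl">title_txt,body_txt,id_i</str>
    <int name="mlt.mintf">1</int>
    <int name="mlt.mindf">1</int>
  </lst>
</requestHandler>
{code}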



> SolrException during indexing
> -
>
> Key: SOLR-10197
> URL: https://issues.apache.org/jira/browse/SOLR-10197
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Server
>Affects Versions: 4.5
> Environment: Ubuntu 14.04.5 LTS
>Reporter: Sorin Gheorghiu
>  Labels: bluespice, indexing
> Attachments: BS_Solr_error_invalid_no.txt
>
>
> BlueSpice (http://bluespice.com/) uses Solr to index documents for the 
> 'Extended search' feature. Solr hangs consistently during indexing and an 
> error occurs (see attached).
> In ExtendedSearch.log there is no error, only the latest indexed 
> document/wiki page:
> 22.02.2017 17:45:11
> Articles to index: 4205
> 1: Indexing wiki pages: 1% - WUI netz.xls
> 2: Indexing wiki pages: 1% - IndividArbanw.pdf
> ...
> 3526: Indexing wiki pages: 84% - 2007
> 3527: Indexing wiki pages: 84% - Buchdurchlaufzeit
> 3528: Indexing wiki pages: 84% - Mahnroutinen
> 3529: Indexing wiki pages: 84% - Software für Informationskompetenz
> Could you provide any indication of the error?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-7707:

Attachment: LUCENE-7707.patch

here is a patch

> Only assign ScoreDoc#shardIndex if it was already assigned to non default 
> (-1) value
> 
>
> Key: LUCENE-7707
> URL: https://issues.apache.org/jira/browse/LUCENE-7707
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7707.patch
>
>
> When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
> value. The assumption made here is that all shard results are merged at 
> once, which is not necessarily the case. If, for instance, incremental merge 
> phases are applied, the shard index doesn't correspond to the index in the 
> outer TopDocs array. To make this a backwards-compatible yet 
> non-controversial change, we could change the internals of TopDocs#merge to 
> only assign this value if it has not been assigned before to a non-default 
> (-1) value, allowing multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-7707) Only assign ScoreDoc#shardIndex if it was already assigned to non default (-1) value

2017-02-23 Thread Simon Willnauer (JIRA)
Simon Willnauer created LUCENE-7707:
---

 Summary: Only assign ScoreDoc#shardIndex if it was already 
assigned to non default (-1) value
 Key: LUCENE-7707
 URL: https://issues.apache.org/jira/browse/LUCENE-7707
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Simon Willnauer
 Fix For: master (7.0), 6.5.0


When you use TopDocs.merge today it always overrides the ScoreDoc#shardIndex 
value. The assumption made here is that all shard results are merged at once, 
which is not necessarily the case. If, for instance, incremental merge phases 
are applied, the shard index doesn't correspond to the index in the outer 
TopDocs array. To make this a backwards-compatible yet non-controversial 
change, we could change the internals of TopDocs#merge to only assign this 
value if it has not been assigned before to a non-default (-1) value, allowing 
multiple or sparse top docs merging.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880568#comment-15880568
 ] 

ASF subversion and git services commented on SOLR-9450:
---

Commit 9ecc1ec79db7ed2b7f8f7bb4ce6cf93d2ce3c382 in lucene-solr's branch 
refs/heads/branch_6x from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9ecc1ec ]

SOLR-9450: The docs/ folder in the binary distribution now contains a single 
index.html file linking to the online documentation, reducing the size of the 
download


> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved SOLR-9450.
-
Resolution: Fixed

> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880566#comment-15880566
 ] 

ASF subversion and git services commented on SOLR-9450:
---

Commit 894a43b259a72a82f07649b0d93ab3c17c4d89c4 in lucene-solr's branch 
refs/heads/master from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=894a43b ]

SOLR-9450: The docs/ folder in the binary distribution now contains a single 
index.html file linking to the online documentation, reducing the size of the 
download


> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10076) Hiding keystore and truststore passwords from /admin/info/* outputs

2017-02-23 Thread Mano Kovacs (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880512#comment-15880512
 ] 

Mano Kovacs commented on SOLR-10076:


Thank you for the feedback, [~ichattopadhyaya].

Do you think the redaction of the command-line password could be handled the 
way the first patch does it?

> Hiding keystore and truststore passwords from /admin/info/* outputs
> ---
>
> Key: SOLR-10076
> URL: https://issues.apache.org/jira/browse/SOLR-10076
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mano Kovacs
> Attachments: SOLR-10076.patch
>
>
> Passing the keystore and truststore passwords is done via system properties, 
> set as command-line parameters.
> As a result, {{/admin/info/properties}} and {{/admin/info/system}} will print 
> out the received passwords.
> Proposing a solution to automatically redact, before output, the value of any 
> system property whose name contains the word {{password}}, replacing its 
> value with {{**}}.
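
A minimal sketch of the redaction idea described above (hypothetical, not the 
attached patch):

{code}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: any system property whose name contains "password"
// has its value replaced with "**" before being rendered.
final class PropertyRedactor {
  static Map<String, String> redacted(Properties props) {
    Map<String, String> out = new LinkedHashMap<>();
    for (String name : props.stringPropertyNames()) {
      boolean secret = name.toLowerCase().contains("password");
      out.put(name, secret ? "**" : props.getProperty(name));
    }
    return out;
  }
}
{code}

The /admin/info handlers would then render something like 
{{redacted(System.getProperties())}} instead of the raw properties; the class 
and method names here are made up for the example.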



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9527) Solr RESTORE api doesn't distribute the replicas uniformly

2017-02-23 Thread Dewald Viljoen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880507#comment-15880507
 ] 

Dewald Viljoen commented on SOLR-9527:
--

I've run smack-dab into this issue recently, and I was wondering what the 
progress is on this patch.

I'm currently running Solr 6.4.1 and would really like to take advantage of the 
Collections Backup/Restore functionality in combination with HDFS. All works 
well until I restore the collection and all my shards end up on one of my 
SolrCloud nodes. I can specify a replicationFactor of 2 and then though some 
other API calls make the replica's the leaders and rebalance everything but 
it's a bit of a mess.

I'm happy to lend my efforts to get this issue resolved.
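
For reference, the manual workaround described above looks roughly like the following Collections API calls (collection, backup, location, shard, and replica names are illustrative):

{noformat}
# Restore with an extra replica per shard
/admin/collections?action=RESTORE&name=mybackup&location=hdfs://namenode/backups&collection=restored&replicationFactor=2
# Mark a replica on another node as preferred leader, then rebalance leadership
/admin/collections?action=ADDREPLICAPROP&collection=restored&shard=shard1&replica=core_node3&property=preferredLeader&property.value=true
/admin/collections?action=REBALANCELEADERS&collection=restored
{noformat}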

> Solr RESTORE api doesn't distribute the replicas uniformly 
> ---
>
> Key: SOLR-9527
> URL: https://issues.apache.org/jira/browse/SOLR-9527
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.1
>Reporter: Hrishikesh Gadre
> Attachments: SOLR-9527.patch, SOLR-9527.patch, SOLR-9527.patch
>
>
> Please refer to this email thread for details,
> http://lucene.markmail.org/message/ycun4x5nx7lwj5sk?q=solr+list:org%2Eapache%2Elucene%2Esolr-user+order:date-backward=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880505#comment-15880505
 ] 

Uwe Schindler commented on SOLR-9450:
-

So I think I'll start by committing the current patch, and then we can work on 
improving the documentation.

> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880503#comment-15880503
 ] 

Uwe Schindler commented on SOLR-9450:
-

Ah OK, cool.

> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-7419) performance bug in tokenstream.end()

2017-02-23 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-7419.

Resolution: Fixed

OK, I backported to the 5.5.x branch, so the next release we do here will have it.

> performance bug in tokenstream.end()
> 
>
> Key: LUCENE-7419
> URL: https://issues.apache.org/jira/browse/LUCENE-7419
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Blocker
> Fix For: master (7.0), 5.5.5, 6.2
>
> Attachments: LUCENE-7419.patch
>
>
> TokenStream.end() calls getAttribute(), which is pretty costly to do 
> per-stream.
> It does its current hack because doing the lookup in the ctor of TokenStream 
> is "too early".
> Instead, we can just add a variant of clear(), called end(), to AttributeImpl. 
> For most attributes it defers to clear, but for PosIncAtt it can handle the 
> special case.
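
A minimal sketch of that idea (hypothetical class names, not the committed patch):

{code}
// Hedged sketch: end() defaults to clear(), and only the position-increment
// attribute overrides it for its special end-of-stream state, so
// TokenStream.end() no longer needs a per-stream getAttribute() lookup.
abstract class AttributeImplSketch {
  public abstract void clear();

  // By default, ending the stream behaves exactly like clearing the attribute.
  public void end() {
    clear();
  }
}

class PosIncAttSketch extends AttributeImplSketch {
  private int positionIncrement = 1;

  @Override
  public void clear() {
    positionIncrement = 1; // default increment while tokens are flowing
  }

  @Override
  public void end() {
    positionIncrement = 0; // at end(), the increment must be 0, not 1
  }
}
{code}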



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7419) performance bug in tokenstream.end()

2017-02-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880500#comment-15880500
 ] 

ASF subversion and git services commented on LUCENE-7419:
-

Commit 4dbaed52a0a721b2b9668ee8074da42585fd54ea in lucene-solr's branch 
refs/heads/branch_5_5 from [~rcmuir]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4dbaed5 ]

LUCENE-7419: Don't lookup PositionIncrementAttribute every time in 
TokenStream.end()


> performance bug in tokenstream.end()
> 
>
> Key: LUCENE-7419
> URL: https://issues.apache.org/jira/browse/LUCENE-7419
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Blocker
> Fix For: master (7.0), 6.2, 5.5.5
>
> Attachments: LUCENE-7419.patch
>
>
> TokenStream.end() calls getAttribute(), which is pretty costly to do 
> per-stream.
> It does its current hack because doing the lookup in the ctor of TokenStream 
> is "too early".
> Instead, we can just add a variant of clear(), called end(), to AttributeImpl. 
> For most attributes it defers to clear, but for PosIncAtt it can handle the 
> special case.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7704) SysnonymGraphFilter doesn't respect ignoreCase parameter

2017-02-23 Thread Sebastian Yonekura Baeza (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880486#comment-15880486
 ] 

Sebastian Yonekura Baeza commented on LUCENE-7704:
--

Oh, sorry, I missed those docs; given that it was a deprecated class, I didn't 
pay much attention to it.

Indeed, without the javadocs the parameter {{ignoreCase}} was kind of 
misleading. Thank you [~mikemccand] for the clarification!
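
For anyone else tripped up by this: {{ignoreCase}} only lowercases the incoming tokens before the map lookup, so the entries must already be lowercased when the {{SynonymMap}} is built. A hedged sketch of the corresponding fix to the test above (a hypothetical helper, reusing the test's imports):

{code}
// Hedged sketch: with SynonymGraphFilter(tokenizer, synonymMap, true), only the
// *input* tokens are lowercased before lookup, so the map's match side must be
// stored lowercased for "WORD" to find the entry.
import java.io.IOException;
import java.util.Locale;
import org.apache.lucene.analysis.synonym.SynonymMap;
import org.apache.lucene.util.CharsRef;
import org.apache.lucene.util.CharsRefBuilder;

class LowercasedSynonymMap {
  static SynonymMap build(String inputString, String outputString) throws IOException {
    SynonymMap.Builder builder = new SynonymMap.Builder(false);
    CharsRef input = SynonymMap.Builder.join(
        inputString.toLowerCase(Locale.ROOT).split(" "), new CharsRefBuilder());
    CharsRef output = SynonymMap.Builder.join(outputString.split(" "), new CharsRefBuilder());
    builder.add(input, output, true);
    return builder.build();
  }
}
{code}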

> SysnonymGraphFilter doesn't respect ignoreCase parameter
> 
>
> Key: LUCENE-7704
> URL: https://issues.apache.org/jira/browse/LUCENE-7704
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: 6.4.1
>Reporter: Sebastian Yonekura Baeza
>Priority: Minor
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7704.patch
>
>
> Hi, it seems that SynonymGraphFilter doesn't respect the ignoreCase parameter. 
> In particular, this test doesn't pass:
> {code:title=UppercaseSynonymMapTest.java|borderStyle=solid}
> package com.mapcity.suggest.lucene;
>
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.analysis.Tokenizer;
> import org.apache.lucene.analysis.core.WhitespaceTokenizer;
> import org.apache.lucene.analysis.synonym.SynonymGraphFilter;
> import org.apache.lucene.analysis.synonym.SynonymMap;
> import org.apache.lucene.util.CharsRef;
> import org.apache.lucene.util.CharsRefBuilder;
> import org.junit.Test;
>
> import java.io.IOException;
>
> import static org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents;
>
> /**
>  * @author Sebastian Yonekura
>  * Created on 22-02-17
>  */
> public class UppercaseSynonymMapTest {
>
>   @Test
>   public void analyzerTest01() throws IOException {
>     // This passes
>     testAssertMapping("word", "synonym");
>     // this one not
>     testAssertMapping("word".toUpperCase(), "synonym");
>   }
>
>   private void testAssertMapping(String inputString, String outputString) throws IOException {
>     SynonymMap.Builder builder = new SynonymMap.Builder(false);
>     CharsRef input = SynonymMap.Builder.join(inputString.split(" "), new CharsRefBuilder());
>     CharsRef output = SynonymMap.Builder.join(outputString.split(" "), new CharsRefBuilder());
>     builder.add(input, output, true);
>     Analyzer analyzer = new CustomAnalyzer(builder.build());
>     TokenStream tokenStream = analyzer.tokenStream("field", inputString);
>     assertTokenStreamContents(tokenStream, new String[]{outputString, inputString});
>   }
>
>   static class CustomAnalyzer extends Analyzer {
>     private SynonymMap synonymMap;
>
>     CustomAnalyzer(SynonymMap synonymMap) {
>       this.synonymMap = synonymMap;
>     }
>
>     @Override
>     protected TokenStreamComponents createComponents(String s) {
>       Tokenizer tokenizer = new WhitespaceTokenizer();
>       TokenStream tokenStream = new SynonymGraphFilter(tokenizer, synonymMap, true); // ignoreCase = true
>       return new TokenStreamComponents(tokenizer, tokenStream);
>     }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9450) Link to online Javadocs instead of distributing with binary download

2017-02-23 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880477#comment-15880477
 ] 

Steve Rowe commented on SOLR-9450:
--

{quote}
bq. Update website quickstart.mdtext to suggest indexing something else than 
local javadocs
There is also a quickstart.mdtext in the checkout's site directory. We should 
change it there, too.
{quote}

FYI, there is an Ant target to convert the distribution doc version into the 
website version - see the last bullet in item #1 here: 
[https://wiki.apache.org/lucene-java/ReleaseTodo#Update_the_rest_of_the_website].

> Link to online Javadocs instead of distributing with binary download
> 
>
> Key: SOLR-9450
> URL: https://issues.apache.org/jira/browse/SOLR-9450
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch, 
> SOLR-9450.patch, SOLR-9450.patch, SOLR-9450.patch
>
>
> Spinoff from SOLR-6806. This sub task will replace the contents of {{docs}} 
> in the binary download with a link to the online JavaDocs. The build should 
> make sure to generate a link to the correct version. I believe this is the 
> correct template: http://lucene.apache.org/solr/6_2_0/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-7704) SysnonymGraphFilter doesn't respect ignoreCase parameter

2017-02-23 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-7704.

   Resolution: Fixed
Fix Version/s: 6.5.0
   master (7.0)

> SysnonymGraphFilter doesn't respect ignoreCase parameter
> 
>
> Key: LUCENE-7704
> URL: https://issues.apache.org/jira/browse/LUCENE-7704
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: 6.4.1
>Reporter: Sebastian Yonekura Baeza
>Priority: Minor
> Fix For: master (7.0), 6.5.0
>
> Attachments: LUCENE-7704.patch
>
>
> Hi, it seems that SynonymGraphFilter doesn't respect the ignoreCase parameter. 
> In particular, this test doesn't pass:
> {code:title=UppercaseSynonymMapTest.java|borderStyle=solid}
> package com.mapcity.suggest.lucene;
>
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.analysis.Tokenizer;
> import org.apache.lucene.analysis.core.WhitespaceTokenizer;
> import org.apache.lucene.analysis.synonym.SynonymGraphFilter;
> import org.apache.lucene.analysis.synonym.SynonymMap;
> import org.apache.lucene.util.CharsRef;
> import org.apache.lucene.util.CharsRefBuilder;
> import org.junit.Test;
>
> import java.io.IOException;
>
> import static org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents;
>
> /**
>  * @author Sebastian Yonekura
>  * Created on 22-02-17
>  */
> public class UppercaseSynonymMapTest {
>
>   @Test
>   public void analyzerTest01() throws IOException {
>     // This passes
>     testAssertMapping("word", "synonym");
>     // this one not
>     testAssertMapping("word".toUpperCase(), "synonym");
>   }
>
>   private void testAssertMapping(String inputString, String outputString) throws IOException {
>     SynonymMap.Builder builder = new SynonymMap.Builder(false);
>     CharsRef input = SynonymMap.Builder.join(inputString.split(" "), new CharsRefBuilder());
>     CharsRef output = SynonymMap.Builder.join(outputString.split(" "), new CharsRefBuilder());
>     builder.add(input, output, true);
>     Analyzer analyzer = new CustomAnalyzer(builder.build());
>     TokenStream tokenStream = analyzer.tokenStream("field", inputString);
>     assertTokenStreamContents(tokenStream, new String[]{outputString, inputString});
>   }
>
>   static class CustomAnalyzer extends Analyzer {
>     private SynonymMap synonymMap;
>
>     CustomAnalyzer(SynonymMap synonymMap) {
>       this.synonymMap = synonymMap;
>     }
>
>     @Override
>     protected TokenStreamComponents createComponents(String s) {
>       Tokenizer tokenizer = new WhitespaceTokenizer();
>       TokenStream tokenStream = new SynonymGraphFilter(tokenizer, synonymMap, true); // ignoreCase = true
>       return new TokenStreamComponents(tokenizer, tokenStream);
>     }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


