[jira] [Created] (SOLR-7539) Add a QueryAutofilteringComponent for query introspection using indexed metadata

2015-05-13 Thread Ted Sullivan (JIRA)
Ted Sullivan created SOLR-7539:
--

 Summary: Add a QueryAutofilteringComponent for query introspection 
using indexed metadata
 Key: SOLR-7539
 URL: https://issues.apache.org/jira/browse/SOLR-7539
 Project: Solr
  Issue Type: New Feature
Reporter: Ted Sullivan
Priority: Minor


The Query Autofiltering Component provides a method of inferring user intent by 
mapping noun phrases that are typically used for faceted navigation to Solr 
filter or boost queries (depending on configuration settings), so that more 
precise user queries can be met with more precise results.

The algorithm uses a longest-contiguous-phrase-match strategy, which allows it 
to disambiguate queries where single terms are ambiguous but phrases are not. 
It works when there is structured information in the form of String fields 
that are normally used for faceted navigation. It works across fields by 
building a map of search term to index field using the Lucene FieldCache 
(UninvertingReader). This enables users to create free-text, multi-term queries 
that combine attributes across facet fields - as if they had searched and then 
navigated through several facet layers. To address the exact-match-only 
semantics of String fields, support for synonyms (including multi-term 
synonyms) and stemming was added. 
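A minimal sketch of the longest-contiguous-phrase-match idea (illustrative names and data only, not the component's actual code): known facet values sit in a phrase-to-field map, and the query is scanned greedily for the longest matching phrase at each position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class AutofilterSketch {
    // Hypothetical phrase -> field map; the real component builds this from
    // indexed String-field values via the FieldCache/UninvertingReader.
    static final Map<String, String> PHRASE_TO_FIELD = Map.of(
            "red sox", "team",
            "red", "color",
            "jersey", "product_type");

    // Greedily match the longest known phrase starting at each query position.
    static List<String> toFilterQueries(String query) {
        String[] terms = query.toLowerCase().split("\\s+");
        List<String> filters = new ArrayList<>();
        int i = 0;
        while (i < terms.length) {
            String match = null;
            int matchLen = 0;
            for (int j = terms.length; j > i; j--) {  // longest span first
                String phrase = String.join(" ", Arrays.copyOfRange(terms, i, j));
                String field = PHRASE_TO_FIELD.get(phrase);
                if (field != null) {
                    match = field + ":\"" + phrase + "\"";
                    matchLen = j - i;
                    break;
                }
            }
            if (match != null) { filters.add(match); i += matchLen; }
            else i++;  // unknown term: leave it for the main query
        }
        return filters;
    }

    public static void main(String[] args) {
        // "red" alone is ambiguous, but "red sox" resolves to the team field
        System.out.println(toFilterQueries("red sox jersey"));
    }
}
```

Here "red sox jersey" yields a team filter plus a product_type filter rather than a color filter, because the longer phrase wins - the disambiguation behavior described above.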



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: Recent Java 9 commit (e5b66323ae45) breaks fsync on directory

2015-05-13 Thread Uwe Schindler
Hi Brian,

 

many thanks for opening this issue! I agree with Alan that adding an OpenOption 
would be a good approach. In any case, as Files only contains static 
methods, we could still add a "utility" method that forces file/directory 
buffers to disk and just uses the new open option under the hood. That way the 
FileSystem SPI interfaces do not need to be modified and just need to handle 
the new OpenOption (if supported).

 

There is one additional issue we found recently on MacOSX, but it is only 
slightly related to the one here. It looks like on MacOSX, FileChannel#force is 
mostly a no-op with respect to syncing data to disk, because the underlying 
operating system requires a "special" fcntl to force buffers to the disk device:

 

https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/fsync.2.html:

 For applications that require tighter guarantees about the integrity of 
 their data, Mac OS X provides the F_FULLFSYNC fcntl.  The F_FULLFSYNC fcntl 
 asks the drive to flush all buffered data to permanent storage. 
 Applications, such as databases, that require a strict ordering of writes 
 should use F_FULLFSYNC to ensure that their data is written in the order 
 they expect.  Please see fcntl(2) for more detail.

 

This different behavior breaks the guarantees of FileChannel#force on MacOSX 
(as described in the Javadocs). So the MacOSX FileSystemProvider implementation 
should use this special fcntl to force file buffers to disk.
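For reference, the Java-level pattern under discussion looks like the sketch below: a plain FileChannel#force on a file and on its parent directory. The macOS caveat is exactly that force(true) there maps to plain fsync(2) rather than the F_FULLFSYNC fcntl, which Java cannot issue without native code.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FsyncDemo {
    // Force a file's (or directory's) buffers to disk via FileChannel#force.
    // On Mac OS X this may only flush to the drive cache, per the discussion above.
    static void fsync(Path path, boolean isDir) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE)) {
            ch.force(true);  // sync file data and metadata
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fsync-demo");
        Path file = Files.write(dir.resolve("data"), new byte[] {1, 2, 3});
        fsync(file, false);  // fsync the file itself
        fsync(dir, true);    // fsync the directory entry (works on Linux; platform-dependent)
        System.out.println("synced " + file);
    }
}
```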

 

Should I open a bug report on bugs.sun.com?

 

Uwe

 

-

Uwe Schindler

uschind...@apache.org 

ASF Member, Apache Lucene PMC / Committer

Bremen, Germany

http://lucene.apache.org/

 

From: nio-dev [mailto:nio-dev-boun...@openjdk.java.net] On Behalf Of Brian 
Burkhalter
Sent: Wednesday, May 13, 2015 12:26 AM
To: nio-dev
Cc: rory.odonn...@oracle.com; dev@lucene.apache.org; Balchandra Vaidya
Subject: Re: Recent Java 9 commit (e5b66323ae45) breaks fsync on directory

 

I have created an enhancement issue here:

 

https://bugs.openjdk.java.net/browse/JDK-8080235

 

Brian

 

On May 12, 2015, at 3:10 PM, Brian Burkhalter brian.burkhal...@oracle.com 
wrote:





I will create an issue now and post the ID.

 



[jira] [Commented] (SOLR-6220) Replica placement strategy for solrcloud

2015-05-13 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541638#comment-14541638
 ] 

Noble Paul commented on SOLR-6220:
--

[~mewmewball] That's the plan 

it should be added for addReplica, createShard and splitShard 

 Replica placement strategy for solrcloud
 

 Key: SOLR-6220
 URL: https://issues.apache.org/jira/browse/SOLR-6220
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-6220.patch, SOLR-6220.patch, SOLR-6220.patch, 
 SOLR-6220.patch, SOLR-6220.patch, SOLR-6220.patch, SOLR-6220.patch


 h1.Objective
 Most cloud-based systems allow specifying rules for how the replicas/nodes of 
 a cluster are allocated. Solr should have a flexible mechanism through which 
 we can control the allocation of replicas, or later change it to 
 suit the needs of the system.
 All configuration is on a per-collection basis. The rules are applied whenever a 
 replica is created in any of the shards of a given collection during
  * collection creation
  * shard splitting
  * add replica
  * createShard
 There are two aspects to how replicas are placed: snitch and placement. 
 h2.snitch 
 How to identify the tags of nodes. Snitches are configured through the 
 collection create command with the snitch param, e.g. snitch=EC2Snitch or 
 snitch=class:EC2Snitch
 h2.ImplicitSnitch 
 This is shipped by default with Solr. The user does not need to specify 
 {{ImplicitSnitch}} in configuration. If the tags known to ImplicitSnitch are 
 present in the rules, it is automatically used.
 Tags provided by ImplicitSnitch:
 # cores : number of cores on the node
 # disk : disk space available on the node 
 # host : host name of the node
 # node : node name 
 # D.* : values available from system properties. {{D.key}} means a 
 value that is passed to the node as {{-Dkey=keyValue}} during node 
 startup. It is possible to use rules like {{D.key:expectedVal,shard:*}}
 h2.Rules 
 A rule tells how many replicas of a given shard need to be assigned to nodes 
 with the given key-value pairs. These parameters are passed to the 
 collection CREATE API as a multi-valued parameter {{rule}}. The values are 
 saved in the state of the collection as follows:
 {code:Javascript}
 {
   "mycollection": {
     "snitch": {
       "class": "ImplicitSnitch"
     },
     "rules": [{"cores": "4-"},
               {"replica": 1, "shard": "*", "node": "*"},
               {"disk": 100}]
   }
 }
 {code}
 A rule is specified in a pseudo-JSON syntax: a map of keys and 
 values.
 * Each collection can have any number of rules. As long as the rules do not 
 conflict with each other it is OK; otherwise an error is thrown.
 * In each rule, shard and replica can be omitted.
 ** The default value of replica is {{\*}}, meaning ANY, or you can specify a count 
 and an operand such as {{<}} (less than) or {{>}} (greater than).
 ** The value of shard can be a shard name, or {{\*}} meaning EACH, or 
 {{**}} meaning ANY. The default value is {{\*\*}} (ANY).
 * There should be exactly one extra condition in a rule other than {{shard}} 
 and {{replica}}.  
 * All keys other than {{shard}} and {{replica}} are called tags, and the tags 
 are simply values provided by the snitch for each node.
 * By default certain tags such as {{node}}, {{host}}, {{port}} are provided 
 by the system implicitly. 
 h3.How are nodes picked? 
 Nodes are not picked at random. The rules are used to first sort the nodes 
 according to affinity. For example, if there is a rule that says 
 {{disk:100+}}, nodes with more disk space are given higher preference. And 
 if the rule is {{disk:100-}}, nodes with less disk space are given 
 priority. If everything else is equal, nodes with fewer cores are given 
 higher priority.
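As an illustration of that ordering (a stand-alone sketch with made-up names, not Solr's placement code), a disk rule can be modeled as a comparator with a fewer-cores tie-break:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RuleSortSketch {
    // Minimal stand-in for a node plus the tags a snitch would report.
    static class Node {
        final String name; final long diskGb; final int cores;
        Node(String name, long diskGb, int cores) {
            this.name = name; this.diskGb = diskGb; this.cores = cores;
        }
    }

    // preferMoreDisk=true models a disk:100+ rule, false models disk:100-.
    static List<Node> sortForRule(List<Node> nodes, boolean preferMoreDisk) {
        Comparator<Node> byDisk = Comparator.comparingLong((Node n) -> n.diskGb);
        if (preferMoreDisk) byDisk = byDisk.reversed();
        Comparator<Node> order = byDisk.thenComparingInt(n -> n.cores);  // tie-break: fewer cores first
        List<Node> sorted = new ArrayList<>(nodes);
        sorted.sort(order);
        return sorted;
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
                new Node("a", 80, 2), new Node("b", 120, 5), new Node("c", 120, 1));
        // disk:100+ -> "c" sorts first: most disk, and fewer cores than "b"
        System.out.println(sortForRule(nodes, true).get(0).name);
    }
}
```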
 h3.Fuzzy match
 Fuzzy matching can be applied when strict matches fail. A value can be 
 suffixed with {{~}} to specify fuzziness.
 example rule
 {noformat}
 # Example requirement: use only one replica of a shard per host if possible; 
 # if no matches are found, relax that rule. 
 rack:*,shard:*,replica:2~
 # Another example: assign all replicas to nodes with disk space of 100GB or 
 # more, or relax the rule if not possible. This ensures that if no node 
 # has a 100GB disk, nodes are picked in order of size, e.g. an 85GB node 
 # would be picked over an 80GB node.
 disk:100~
 {noformat}
 Examples:
 {noformat}
 # in each rack there can be max two replicas of a given shard
  rack:*,shard:*,replica:3
 # in each rack there can be max two replicas of ANY shard
  rack:*,shard:**,replica:2
  rack:*,replica:3
 # in each node there should be a max of one replica of EACH shard
  node:*,shard:*,replica:1-
 # in each node there should be a max of one replica of ANY shard
  

[jira] [Commented] (SOLR-7143) MoreLikeThis Query Parser does not handle multiple field names

2015-05-13 Thread Jens Wille (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541695#comment-14541695
 ] 

Jens Wille commented on SOLR-7143:
--

Hi Anshum, can you say anything about the status of this issue? Can you give me 
any pointers as to what I might be able to do?

 MoreLikeThis Query Parser does not handle multiple field names
 --

 Key: SOLR-7143
 URL: https://issues.apache.org/jira/browse/SOLR-7143
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 5.0
Reporter: Jens Wille
Assignee: Anshum Gupta
 Attachments: SOLR-7143.patch, SOLR-7143.patch


 The newly introduced MoreLikeThis Query Parser (SOLR-6248) does not return 
 any results when supplied with multiple fields in the {{qf}} parameter.
 To reproduce within the techproducts example, compare:
 {code}
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name%7DMA147LL/A'
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=features%7DMA147LL/A'
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name,features%7DMA147LL/A'
 {code}
 The first two queries return 8 and 5 results, respectively. The third query 
 doesn't return any results (not even the matched document).
 In contrast, the MoreLikeThis Handler works as expected (accounting for the 
 default {{mintf}} and {{mindf}} values in SimpleMLTQParser):
 {code}
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name&mlt.mintf=1&mlt.mindf=1'
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=features&mlt.mintf=1&mlt.mindf=1'
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name,features&mlt.mintf=1&mlt.mindf=1'
 {code}
 After adding the following line to 
 {{example/techproducts/solr/techproducts/conf/solrconfig.xml}}:
 {code:language=XML}
 <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
 {code}
 The first two queries return 7 and 4 results, respectively (excluding the 
 matched document). The third query returns 7 results, as one would expect.






[jira] [Commented] (SOLR-7539) Add a QueryAutofilteringComponent for query introspection using indexed metadata

2015-05-13 Thread Ted Sullivan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541721#comment-14541721
 ] 

Ted Sullivan commented on SOLR-7539:


Initial patch uploaded.  I have published a blog article explaining the 
rationale of this component at 

http://lucidworks.com/blog/query-autofiltering-revisited-can-precise/

 Add a QueryAutofilteringComponent for query introspection using indexed 
 metadata
 

 Key: SOLR-7539
 URL: https://issues.apache.org/jira/browse/SOLR-7539
 Project: Solr
  Issue Type: New Feature
Reporter: Ted Sullivan
Priority: Minor
 Attachments: SOLR-7539.patch


 The Query Autofiltering Component provides a method of inferring user intent 
 by matching noun phrases that are typically used for faceted-navigation into 
 Solr filter or boost queries (depending on configuration settings) so that 
 more precise user queries can be met with more precise results.
 The algorithm uses a longest contiguous phrase match strategy which allows 
 it to disambiguate queries where single terms are ambiguous but phrases are 
 not. It will work when there is structured information in the form of String 
 fields that are normally used for faceted navigation. It works across fields 
 by building a map of search term to index field using the Lucene FieldCache 
 (UninvertingReader). This enables users to create free text, multi-term 
 queries that combine attributes across facet fields - as if they had searched 
 and then navigated through several facet layers. To address the problem of 
 exact-match only semantics of String fields, support for synonyms (including 
 multi-term synonyms) and stemming was added. 






[jira] [Created] (SOLR-7538) Get count of facet.pivot on distinct combination of fields

2015-05-13 Thread Nagabhushan (JIRA)
Nagabhushan created SOLR-7538:
-

 Summary: Get count of facet.pivot on distinct combination of fields
 Key: SOLR-7538
 URL: https://issues.apache.org/jira/browse/SOLR-7538
 Project: Solr
  Issue Type: Task
 Environment: 4.10
Reporter: Nagabhushan
Priority: Trivial


Hi, I need to get action-wise counts in a campaign. I am using 
facet.pivot=campaignId,action to get them.

Ex : campaignId,id,action
   1,1,a
   1,1,a
   1,2,a
   1,2,b

When I do the facet.pivot I get {a:3,b:1}; the facet counts duplicate rows.

I need counts distinct by the combination of campaignId,id,action, which would be {a:2,b:1}.
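Until such an option exists server-side, the desired result can be sketched as client-side post-processing (plain Java, illustrative only): deduplicate rows on the full (campaignId, id, action) combination, then count per action.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class DistinctPivotSketch {
    // rows are (campaignId, id, action) triples
    static Map<String, Integer> distinctActionCounts(List<String[]> rows) {
        Set<List<String>> distinct = new LinkedHashSet<>();
        for (String[] row : rows) distinct.add(Arrays.asList(row));  // dedupe full combinations
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> row : distinct) counts.merge(row.get(2), 1, Integer::sum);  // count by action
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"1", "1", "a"},
                new String[] {"1", "1", "a"},
                new String[] {"1", "2", "a"},
                new String[] {"1", "2", "b"});
        System.out.println(distinctActionCounts(rows));  // {a=2, b=1}
    }
}
```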

Thanks,







[jira] [Updated] (SOLR-7508) SolrParams.toMultiMap() does not handle arrays

2015-05-13 Thread Thomas Scheffler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Scheffler updated SOLR-7508:
---
Attachment: SOLRJ-7508.patch

Provided patch to fix the issue.

 SolrParams.toMultiMap() does not handle arrays
 --

 Key: SOLR-7508
 URL: https://issues.apache.org/jira/browse/SOLR-7508
 Project: Solr
  Issue Type: Bug
  Components: SolrJ
Affects Versions: 5.0, 5.1
Reporter: Thomas Scheffler
  Labels: easyfix, easytest
 Attachments: SOLRJ-7508.patch


 The following JUnit test shows what I mean:
 {code}
 ModifiableSolrParams params = new ModifiableSolrParams();
 String[] paramValues = new String[] { "title:junit", "author:john" };
 String paramName = "fq";
 params.add(paramName, paramValues);
 NamedList<Object> namedList = params.toNamedList();
 assertEquals("parameter values are not equal", paramValues, 
 namedList.get(paramName));
 Map<String, String[]> multiMap = SolrParams.toMultiMap(namedList);
 assertEquals("Expected " + paramValues.length + " values", 
 paramValues.length, multiMap.get(paramName).length);
 {code}
 The first {{assertEquals()}} runs fine, while the last one triggers the 
 error. Suddenly the length of the array is 1 and the value of {{fq}} is 
 something like {{[Ljava.lang.String;@6f09c9c0}}. Looking into the code, I see 
 that the toMultiMap() method does not even look for arrays.
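The shape of the fix can be sketched like this (using a plain list of (name, value) pairs as a stand-in for Solr's NamedList; names are illustrative): a String[] value must be expanded into its elements instead of being stringified as a whole array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiMapSketch {
    static Map<String, String[]> toMultiMap(List<Map.Entry<String, Object>> pairs) {
        Map<String, List<String>> acc = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : pairs) {
            List<String> vals = acc.computeIfAbsent(e.getKey(), k -> new ArrayList<>());
            Object v = e.getValue();
            if (v instanceof String[]) {
                vals.addAll(Arrays.asList((String[]) v));  // the case the bug misses
            } else {
                vals.add(String.valueOf(v));
            }
        }
        Map<String, String[]> out = new LinkedHashMap<>();
        acc.forEach((k, v) -> out.put(k, v.toArray(new String[0])));
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Object>> pairs = new ArrayList<>();
        pairs.add(Map.<String, Object>entry("fq", new String[] {"title:junit", "author:john"}));
        // With the array expanded, "fq" maps to two values rather than one stringified array
        System.out.println(toMultiMap(pairs).get("fq").length);
    }
}
```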






[jira] [Updated] (SOLR-7539) Add a QueryAutofilteringComponent for query introspection using indexed metadata

2015-05-13 Thread Ted Sullivan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Sullivan updated SOLR-7539:
---
Attachment: SOLR-7539.patch

 Add a QueryAutofilteringComponent for query introspection using indexed 
 metadata
 

 Key: SOLR-7539
 URL: https://issues.apache.org/jira/browse/SOLR-7539
 Project: Solr
  Issue Type: New Feature
Reporter: Ted Sullivan
Priority: Minor
 Attachments: SOLR-7539.patch


 The Query Autofiltering Component provides a method of inferring user intent 
 by matching noun phrases that are typically used for faceted-navigation into 
 Solr filter or boost queries (depending on configuration settings) so that 
 more precise user queries can be met with more precise results.
 The algorithm uses a longest contiguous phrase match strategy which allows 
 it to disambiguate queries where single terms are ambiguous but phrases are 
 not. It will work when there is structured information in the form of String 
 fields that are normally used for faceted navigation. It works across fields 
 by building a map of search term to index field using the Lucene FieldCache 
 (UninvertingReader). This enables users to create free text, multi-term 
 queries that combine attributes across facet fields - as if they had searched 
 and then navigated through several facet layers. To address the problem of 
 exact-match only semantics of String fields, support for synonyms (including 
 multi-term synonyms) and stemming was added. 






[jira] [Commented] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541840#comment-14541840
 ] 

Uwe Schindler commented on LUCENE-6450:
---

bq. As a side note, I'm finishing up a patch that uses precision_step for 
indexing the longs at variable resolution to take advantage of the postings 
list and not visit every term

Cool! :-)

 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (s4j, jts) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries leverage simple 
 multi-phase filtering that starts by leveraging NumericRangeQuery to reduce 
 candidate terms deferring the more expensive mathematics to the smaller 
 candidate sets.
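The bit-twiddling step can be sketched as a standard Morton (Z-order) interleave of two pre-quantized 32-bit coordinates into one long. This is a generic sketch, not Lucene's exact encoding, and the quantization of lat/lon to 32 bits is omitted:

```java
public class MortonSketch {
    // Spread the low 32 bits of v so they occupy the even bit positions.
    static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8))  & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2))  & 0x3333333333333333L;
        v = (v | (v << 1))  & 0x5555555555555555L;
        return v;
    }

    // Interleave two 32-bit quantized coordinates: lat bits land on even
    // positions, lon bits on odd positions, so nearby points share term prefixes.
    static long interleave(int lat32, int lon32) {
        return spread(lat32 & 0xFFFFFFFFL) | (spread(lon32 & 0xFFFFFFFFL) << 1);
    }

    public static void main(String[] args) {
        // All lat bits set, no lon bits: the result fills only the even positions
        System.out.println(Long.toHexString(interleave(0xFFFFFFFF, 0)));
    }
}
```

The prefix-sharing property is what lets NumericRangeQuery-style term ranges act as the coarse first phase described above.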






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541850#comment-14541850
 ] 

Karl Wright commented on LUCENE-6480:
-

So my idea for an (x,y,z) based geohash is as follows:

  - three bits per split iteration: each splits x, y, z into a smaller range
  - the initial range for each dimension is -1 to 1, thus size 2.
  - the first split determines the sign, and is thus backwards: e.g. -1 <= x < 0 
yields bit 0, 0 <= x <= 1 yields bit 1.
  - the second bit splits the range, e.g. 00 means -0.5 <= x < 0.

Questions:
  - Q1: how precise is it to fit in a long? A: 64/3 = 21 splits with 1 bit left 
over.  The cell size is 2/(2^21) = 2^(-20) of the range, i.e. 6.07585906982421875 meters
  - Q2: how to quickly convert to a geocode value?
A: need bit manipulation of mantissa and exponent for this; requires 
further thought (and maybe a hash change)
  - Q3: how to quickly convert back to usable (x,y,z) from a geocode value?
A: first, geo3d has to tolerate imprecision in evaluation.  It does, 
possibly excepting small GeoCircles.  Otherwise, similar bit manipulation of 
mantissa and exponent in a double.

Once there's a reversible packing method, it's pretty trivial to make use of 
all geo3d shapes.
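The three-bits-per-iteration idea above can be sketched directly as range bisection per dimension (a rough illustrative sketch of the proposal, not geo3d code; 21 iterations fill 63 of the 64 bits):

```java
public class XyzHashSketch {
    static final int ITERS = 21;  // 21 * 3 = 63 bits, one bit left over

    // Encode (x, y, z), each in [-1, 1], by emitting one bit per dimension
    // per iteration, halving the surviving range each time.
    static long encode(double x, double y, double z) {
        double[] lo = {-1, -1, -1}, hi = {1, 1, 1};
        double[] p = {x, y, z};
        long code = 0;
        for (int i = 0; i < ITERS; i++) {
            for (int d = 0; d < 3; d++) {
                double mid = (lo[d] + hi[d]) / 2;
                code <<= 1;
                if (p[d] >= mid) { code |= 1; lo[d] = mid; } else { hi[d] = mid; }
            }
        }
        return code;
    }

    // Decode to the midpoint of the final cell; the error per dimension is at
    // most half the cell size, i.e. half of 2 / 2^21 of the range.
    static double[] decode(long code) {
        double[] lo = {-1, -1, -1}, hi = {1, 1, 1};
        for (int i = 0; i < ITERS; i++) {
            for (int d = 0; d < 3; d++) {
                double mid = (lo[d] + hi[d]) / 2;
                long bit = (code >>> (ITERS * 3 - 1 - (i * 3 + d))) & 1L;
                if (bit == 1) { lo[d] = mid; } else { hi[d] = mid; }
            }
        }
        return new double[] {
            (lo[0] + hi[0]) / 2, (lo[1] + hi[1]) / 2, (lo[2] + hi[2]) / 2};
    }

    public static void main(String[] args) {
        double[] back = decode(encode(0.3, -0.7, 0.123));
        System.out.printf("%.7f %.7f %.7f%n", back[0], back[1], back[2]);
    }
}
```

The faster conversion via mantissa/exponent bit manipulation that Q2 asks about would replace the floating-point bisection loop; the packing itself is the reversible part.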


 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Created] (SOLR-7540) SSLMigrationTest urlScheme isn't tested properly

2015-05-13 Thread Steve Rowe (JIRA)
Steve Rowe created SOLR-7540:


 Summary: SSLMigrationTest urlScheme isn't tested properly
 Key: SOLR-7540
 URL: https://issues.apache.org/jira/browse/SOLR-7540
 Project: Solr
  Issue Type: Bug
Reporter: Steve Rowe
Priority: Minor


I noticed that {{SSLMigrationTest.assertReplicaInformation(urlScheme)}} only 
checks that a replica's base URL *starts with* the given URL scheme - since the 
urlScheme can only be "http" or "https", this check will always succeed when 
the given urlScheme is "http".
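The false positive is easy to demonstrate in isolation: every "https" base URL also starts with "http", so the assertion can never fail for the http case, while comparing against the scheme plus ":" pins the scheme boundary.

```java
public class SchemeCheckDemo {
    public static void main(String[] args) {
        String baseUrl = "https://127.0.0.1:8983/solr";
        System.out.println(baseUrl.startsWith("http"));        // true even though the scheme is https
        System.out.println(baseUrl.startsWith("http" + ":"));  // false: the ":" excludes https
    }
}
```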






[jira] [Commented] (SOLR-7540) SSLMigrationTest urlScheme isn't tested properly

2015-05-13 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541868#comment-14541868
 ] 

Steve Rowe commented on SOLR-7540:
--

This fixes it:

{noformat}
Index: solr/core/src/test/org/apache/solr/cloud/SSLMigrationTest.java
===================================================================
--- solr/core/src/test/org/apache/solr/cloud/SSLMigrationTest.java	(revision 1679199)
+++ solr/core/src/test/org/apache/solr/cloud/SSLMigrationTest.java	(working copy)
@@ -103,7 +103,7 @@
     assertEquals("Wrong number of replicas found", 4, replicas.size());
     for(Replica replica : replicas) {
       assertTrue("Replica didn't have the proper urlScheme in the ClusterState",
-          StringUtils.startsWith(replica.getStr(ZkStateReader.BASE_URL_PROP), urlScheme));
+          StringUtils.startsWith(replica.getStr(ZkStateReader.BASE_URL_PROP), urlScheme + ":"));
     }
   }
{noformat}

 SSLMigrationTest urlScheme isn't tested properly
 

 Key: SOLR-7540
 URL: https://issues.apache.org/jira/browse/SOLR-7540
 Project: Solr
  Issue Type: Bug
Reporter: Steve Rowe
Priority: Minor

 I noticed that {{SSLMigrationTest.assertReplicaInformation(urlScheme)}} only 
 checks that a replica's base URL *starts with* the given URL scheme - since 
 the urlScheme can only be "http" or "https", this check will always succeed 
 when the given urlScheme is "http".






[jira] [Commented] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541831#comment-14541831
 ] 

Nicholas Knize commented on LUCENE-6450:


bq. does lucene efficiently support field types of that length?

Yes. This patch (and PackedQuadTree) uses longs for encoding 2d points. I went 
ahead and opened a separate issue [LUCENE-6480 |  
https://issues.apache.org/jira/browse/LUCENE-6480] for investigating the 3d 
case so we can carry the discussion over there.  The goal for this field is to 
provide a framework for search so all we have to worry about is trying out 
different encoding techniques.

As a side note, I'm finishing up a patch that uses precision_step for indexing 
the longs at variable resolution to take advantage of the postings list and not 
visit every term. The index will be slightly bigger but it should provide the 
foundation for faster search on large polygons and bounding boxes. I'll add 
mercator projection after to reduce precision error over large search regions 
and then switch to geo3d and benchmark.

 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (s4j, jts) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries leverage simple 
 multi-phase filtering that starts by leveraging NumericRangeQuery to reduce 
 candidate terms deferring the more expensive mathematics to the smaller 
 candidate sets.






[jira] [Updated] (SOLR-7539) Add a QueryAutofilteringComponent for query introspection using indexed metadata

2015-05-13 Thread Ted Sullivan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Sullivan updated SOLR-7539:
---
Fix Version/s: Trunk

 Add a QueryAutofilteringComponent for query introspection using indexed 
 metadata
 

 Key: SOLR-7539
 URL: https://issues.apache.org/jira/browse/SOLR-7539
 Project: Solr
  Issue Type: New Feature
Reporter: Ted Sullivan
Priority: Minor
 Fix For: Trunk

 Attachments: SOLR-7539.patch


 The Query Autofiltering Component provides a method of inferring user intent 
 by matching noun phrases that are typically used for faceted-navigation into 
 Solr filter or boost queries (depending on configuration settings) so that 
 more precise user queries can be met with more precise results.
 The algorithm uses a longest contiguous phrase match strategy which allows 
 it to disambiguate queries where single terms are ambiguous but phrases are 
 not. It will work when there is structured information in the form of String 
 fields that are normally used for faceted navigation. It works across fields 
 by building a map of search term to index field using the Lucene FieldCache 
 (UninvertingReader). This enables users to create free text, multi-term 
 queries that combine attributes across facet fields - as if they had searched 
 and then navigated through several facet layers. To address the problem of 
 exact-match only semantics of String fields, support for synonyms (including 
 multi-term synonyms) and stemming was added. 






[jira] [Updated] (SOLR-5750) Backup/Restore API for SolrCloud

2015-05-13 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-5750:

Attachment: SOLR-5750.patch

First pass at the feature.

*BACKUP:*
Required params - collection, name, location

Example API: 
{{/admin/collections?action=backup&name=my_backup&location=/my_location&collection=techproducts}}

It will create a backup directory called {{my_location}}, inside which it will store 
the following:

/my_location
 /my_backup
  /shard1
  /shard2
  /zk_backup
   /zk_backup/configs/configName ( The config which was being used for the 
backup collection )
   /zk_backup/collection_state.json ( Always store the cluster state for that 
collection in collection_state.json )
   /backup.properties ( Metadata about the backup )

If you have set up any aliases or roles or any other special property, that 
will not be backed up. It might not be that useful to restore anyway, as the 
backup could be restored in some other cluster. We can add it later if it's required.

*BACKUPSTATUS:*
Required params - name

Example API: {{/admin/collections?action=backupstatus&name=my_backup}}

*RESTORE:*
Required params - collection, name, location

Example API: 
{{/admin/collections?action=restore&name=my_backup&location=/my_location&collection=techproducts_restored}}

You can't restore into an existing collection. Provide a collection name that 
you want to restore the index into. The restore process will create a 
collection similar to the backed-up collection and restore the indexes.

Restoring into the same collection would be simple to add, but in that case we 
should only restore the indexes.

*RESTORESTATUS:*
Required params - name

Example API: {{/admin/collections?action=restorestatus&name=my_backup}}

Would appreciate a review on this. I'll work on adding more tests.

 Backup/Restore API for SolrCloud
 

 Key: SOLR-5750
 URL: https://issues.apache.org/jira/browse/SOLR-5750
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Shalin Shekhar Mangar
Assignee: Varun Thacker
 Fix For: Trunk, 5.2

 Attachments: SOLR-5750.patch


 We should have an easy way to do backups and restores in SolrCloud. The 
 ReplicationHandler supports a backup command which can create snapshots of 
 the index but that is too little.
 The command should be able to backup:
 # Snapshots of all indexes or indexes from the leader or the shards
 # Config set
 # Cluster state
 # Cluster properties
 # Aliases
 # Overseer work queue?
 A restore should be able to completely restore the cloud i.e. no manual steps 
 required other than bringing nodes back up or setting up a new cloud cluster.
 SOLR-5340 will be a part of this issue.






[jira] [Created] (SOLR-7541) CollectionsHandler#createNodeIfNotExists is a duplicate of ZkCmdExecutor#ensureExists

2015-05-13 Thread Varun Thacker (JIRA)
Varun Thacker created SOLR-7541:
---

 Summary: CollectionsHandler#createNodeIfNotExists is a duplicate 
of ZkCmdExecutor#ensureExists
 Key: SOLR-7541
 URL: https://issues.apache.org/jira/browse/SOLR-7541
 Project: Solr
  Issue Type: Improvement
Reporter: Varun Thacker
Priority: Minor


Looks like CollectionsHandler#createNodeIfNotExists is a duplicate of 
ZkCmdExecutor#ensureExists . Both do the same thing so we could remove 
CollectionsHandler#createNodeIfNotExists.

Also looking at {{ZkCmdExecutor#ensureExists(final String path, final byte[] 
data,CreateMode createMode, final SolrZkClient zkClient)}} the createMode 
parameter is getting discarded.






[jira] [Created] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Nicholas Knize (JIRA)
Nicholas Knize created LUCENE-6480:
--

 Summary: Extend Simple GeoPointField Type to 3d 
 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize


[LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
dimensional points to construct sorted term representations of GeoPoints (aka: 
GeoHashing).

This feature investigates adding support for encoding 3 dimensional GeoPoints, 
either by extending GeoPointField to a Geo3DPointField or adding an additional 
3d constructor.






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541933#comment-14541933
 ] 

Nicholas Knize commented on LUCENE-6480:


It sounds much like the simple morton interleaving I'm using for the 2D case? 
But since you're using an extra bit for the 3rd dimension you lose precision in 
the horizontal direction. We could start w/ that as a phase one? Instead of 
worrying about the sign bit the values in the 2D case are scaled 0:360, 0:180 
and divided into 32 bits per lat/lon (see GeoUtils.java).  Extending to 3D 
divide 0:360, 0:180, 0:?? by 21 and extend BitUtil.interleave to the 3 value 
case.  It's super fast since it's done by bit twiddling using magic numbers 
(although the magic numbers will need to be reworked).  The question is the max 
value of the altitude? The larger the value the less precise, but you could 
conceivably go as far as 3,300 (km) to cover the earth's atmosphere?  Maybe 
that's configurable.

As a phase 2 there has been some work in this area for 3 and 4d hilbert order 
(still using 64 bit), which will better preserve locality. (I mentioned it in a 
comment in the previous issue).  Locality is important since it will drive the 
complexity of the range search and how much the postings list will actually 
help (e.g. stepping one unit in the 3rd dimension can result in a boundary 
range that requires post-filtering a significant number of high precision 
terms).

The more I think about it, this might be efficiently done using a statically 
computed lookup table (we'd have to tinker)?  i.e., one hilbert order for the 
3d unit cube is 000, 001, 101, 100, 110, 111, 011, 010, and the order of the 
suboctants at each succeeding level is a permutation of this base unit cube.  
For example, the next rotated level (for suboctant 000) gives the binary order: 
 000 000, 000 010, 000 110, 000 100, 000 101, 000 111, 000 011, 000 001.  
There's a paper that describes how to compute the suboctant permutation rather 
efficiently, and it could be statically computed and represented using 1. base 
unit ordering, 2. substitution list.  So for level 2, each suboctant ordering 
is: base order (000, 001, 101, 100, 110, 111, 011, 010), substitution list (2 
8) (3 5), (2 8 4) (3 7 5), (2 8 4) (3 7 5), (1 3) (2 4) (5 7) (6 8), (1 3) (2 
4) (5 7) (6 8), (1 5 7) (2 4 6), (1 7) (4 6).  Something to think about as an 
enhancement.  I'll try to find the paper.
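A loop-based sketch of the 3-value interleave described above. This is illustrative only, not Lucene code: the BitUtil-style magic-number version would replace the loop with a fixed sequence of shifts and masks, and the coordinate ranges chosen here are assumptions.

```java
// Sketch: interleave the low 21 bits of three scaled coordinates into one
// 63-bit long, so lexicographic term order follows a 3D morton (z-order) curve.
public class Interleave3 {

    // Bit i of x lands at position 3*i, bit i of y at 3*i+1, bit i of z at 3*i+2.
    public static long interleave3(long x, long y, long z) {
        long result = 0L;
        for (int i = 0; i < 21; i++) {
            result |= ((x >>> i) & 1L) << (3 * i)
                    | ((y >>> i) & 1L) << (3 * i + 1)
                    | ((z >>> i) & 1L) << (3 * i + 2);
        }
        return result;
    }

    // Scale a coordinate from [min, max] into an unsigned 21-bit integer.
    public static long scale(double v, double min, double max) {
        return (long) (((v - min) / (max - min)) * ((1L << 21) - 1));
    }

    public static void main(String[] args) {
        // Illustrative ranges: lon -180:180, lat -90:90, altitude 0:3300 km.
        long code = interleave3(scale(12.5, -180, 180),
                                scale(48.1, -90, 90),
                                scale(100.0, 0, 3300));
        System.out.println(Long.toBinaryString(code));
    }
}
```

As the comment thread notes, precision per dimension drops from 32 to 21 bits when the same 64-bit long carries three values instead of two.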

 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 3111 - Failure

2015-05-13 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/3111/

1 tests failed.
REGRESSION:  org.apache.solr.client.solrj.TestLBHttpSolrClient.testReliability

Error Message:
No live SolrServers available to handle this request

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request
at 
__randomizedtesting.SeedInfo.seed([7DD14AAD805F40C:C615C9EC796325A5]:0)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:576)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:943)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:958)
at 
org.apache.solr.client.solrj.TestLBHttpSolrClient.testReliability(TestLBHttpSolrClient.java:219)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at java.lang.Thread.run(Thread.java:745)
Caused by: 

[jira] [Comment Edited] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541933#comment-14541933
 ] 

Nicholas Knize edited comment on LUCENE-6480 at 5/13/15 1:43 PM:
-

It sounds much like the simple morton interleaving I'm using for the 2D case? 
But since you're using an extra bit for the 3rd dimension you lose precision in 
the horizontal direction. We could start w/ that as a phase one? Instead of 
worrying about the sign bit the values in the 2D case are scaled 0:360, 0:180 
and divided into 32 bits per lat/lon (see GeoUtils.java).  Extending to 3D 
divide 0:360, 0:180, 0:?? by 21 and extend BitUtil.interleave to the 3 value 
case.  It's super fast since it's done by bit twiddling using magic numbers 
(although the magic numbers will need to be reworked).  The question is the max 
value of the altitude? The larger the value the less precise, but you could 
conceivably go as far as 3,300 (km) to cover the earth's atmosphere?  Maybe 
that's configurable.

As a phase 2 there has been some work in this area for 3 and 4d hilbert order 
(still using 64 bit), which will better preserve locality. (I mentioned it in a 
comment in the previous issue).  Locality is important since it will drive the 
complexity of the range search and how much the postings list will actually 
help (e.g. stepping one unit in the 3rd dimension can result in a boundary 
range that requires post-filtering a significant number of high precision 
terms).

The more I think about it, this might be efficiently done using a statically 
computed lookup table (we'd have to tinker)?  i.e., one hilbert order for the 
3d unit cube is 000, 001, 101, 100, 110, 111, 011, 010, and the order of the 
suboctants at each succeeding level is a permutation of this base unit cube.  
For example, the next rotated level (for suboctant 000) gives the binary order: 
 000 000, 000 010, 000 110, 000 100, 000 101, 000 111, 000 011, 000 001.  
There's a paper that describes how to compute the suboctant permutation rather 
efficiently, and it could be statically computed and represented using 1. base 
unit ordering, 2. substitution list.  So for level 2, each suboctant ordering 
is: base order (000, 001, 101, 100, 110, 111, 011, 010), substitution list (2 
8) (3 5), (2 8 4) (3 7 5), (2 8 4) (3 7 5), (1 3) (2 4) (5 7) (6 8), (1 3) (2 
4) (5 7) (6 8), (1 5 7) (2 4 6), (1 7) (4 6).  Something to think about as an 
enhancement.  I'll try to find the paper as I've got this worked out in my 
notebook from some previous work (lol).


was (Author: nknize):
Its sounds much like the simple morton interleaving I'm using for the 2D case? 
But since you're using an extra bit for the 3rd dimension you lose precision in 
the horizontal direction. We could start w/ that as a phase one? Instead of 
worrying about the sign bit the values in the 2D case are scaled 0:360, 0:180 
and divided into 32 bits per lat/lon (see GeoUtils.java).  Extending to 3D 
divide 0:360, 0:180, 0:?? by 21 and extend BitUtil.interleave to the 3 value 
case.  Its super fast since its done by bit twiddling using magic numbers 
(although the magic numbers will need to be reworked).  The question is the max 
value of the altitude? The larger the value the less precise, but you could 
conceivably go as far as 3,300 (km) to cover the earth's atmosphere?  Maybe 
that's configurable.

As a phase 2 there has been some work in this area for 3 and 4d hilbert order 
(still using 64 bit), which will better preserve locality. (I mentioned it in a 
comment in the previous issue).  Locality is important since it will drive the 
complexity of the range search and how much the postings list will actually 
help (e.g. stepping one unit in the 3rd dimension can result in a boundary 
range that requires post-filtering a significant number of high precision 
terms).

The more I think about it, this might be efficiently done using a statically 
computed lookup table (we'd have to tinker)?  i.e., one hilbert order for the 
3d unit cube is 000, 001, 101, 100, 110, 111, 011, 010, and the order of the 
suboctants at each succeeding level is a permutation of this base unit cube.  
For example, the next rotated level (for suboctant 000) gives the binary order: 
 000 000, 000 010, 000 110, 000 100, 000 101, 000 111, 000 011, 000 001.  
There's a paper that describes how to compute the suboctant permutation rather 
efficiently, and it could be statically computed and represented using 1. base 
unit ordering, 2. substitution list.  So for level 2, each suboctant ordering 
is: base order (000, 001, 101, 100, 110, 111, 011, 010), substitution list (2 
8) (3 5), (2 8 4) (3 7 5), (2 8 4) (3 7 5), (1 3) (2 4) (5 7) (6 8), (1 3) (2 
4) (5 7) (6 8), (1 5 7) (2 4 6), (1 7) (4 6).  Something to think about as an 
enhancement.  I'll try to find the paper.

 Extend Simple GeoPointField Type to 3d 
 

[jira] [Updated] (SOLR-7540) SSLMigrationTest urlScheme isn't tested properly

2015-05-13 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-7540:
-
Description: I noticed that 
{{SSLMigrationTest.assertReplicaInformation(urlScheme)}} only checks that 
replicas' base url *starts with* the given url scheme - since the urlScheme can 
only be http or https, this check will always succeed when the given 
urlScheme is http.  (was: I noticed that 
{{SSLMigrationTest.assertReplicaInformation(urlScheme)}} only checks that a 
replicas' base url *starts with* the given url scheme - since the urlScheme can 
only be http or https, this check will always succeed when the given 
urlScheme is http.)
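The weakness is easy to see in isolation (a standalone illustration, not the test code itself):

```java
// Why asserting baseUrl.startsWith(urlScheme) is too weak when urlScheme is
// "http": every https URL also starts with "http", so the check cannot fail.
public class SchemeCheck {
    public static void main(String[] args) {
        String httpsBaseUrl = "https://127.0.0.1:8983/solr";
        // Passes even though the replica is actually on https:
        System.out.println(httpsBaseUrl.startsWith("http"));    // true
        // Comparing against the full scheme prefix distinguishes the two:
        System.out.println(httpsBaseUrl.startsWith("http://")); // false
    }
}
```

Checking against `urlScheme + "://"` (or parsing the URL's scheme outright) would make the http case meaningful.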

 SSLMigrationTest urlScheme isn't tested properly
 

 Key: SOLR-7540
 URL: https://issues.apache.org/jira/browse/SOLR-7540
 Project: Solr
  Issue Type: Bug
Reporter: Steve Rowe
Priority: Minor

 I noticed that {{SSLMigrationTest.assertReplicaInformation(urlScheme)}} only 
 checks that replicas' base url *starts with* the given url scheme - since the 
 urlScheme can only be http or https, this check will always succeed when 
 the given urlScheme is http.






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541852#comment-14541852
 ] 

Karl Wright commented on LUCENE-6480:
-

I'll attach code snippets for packing and unpacking ASAP, but it may not be 
until this weekend.


 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Commented] (SOLR-6213) StackOverflowException in Solr cloud's leader election

2015-05-13 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541958#comment-14541958
 ] 

Mark Miller commented on SOLR-6213:
---

 Of course it should be improved.
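One hypothetical shape for such an improvement, sketched against the looping call chain in the stack trace quoted below (this is not Solr's actual election code): turn the mutual recursion joinElection -> runLeaderProcess -> rejoinLeaderElection -> joinElection into an explicit loop with a retry bound, so a flapping election cannot grow the stack.

```java
// Sketch: an iterative election loop with a bounded retry count.
// MAX_REJOINS and the BooleanSupplier stand in for the real election logic.
public class ElectionLoop {
    static final int MAX_REJOINS = 100;  // assumed bound, for illustration

    // Returns the number of attempts it took; stack depth stays constant.
    public static int joinElection(java.util.function.BooleanSupplier tryBecomeLeader) {
        int attempts = 0;
        while (attempts < MAX_REJOINS) {
            attempts++;
            if (tryBecomeLeader.getAsBoolean()) {
                return attempts;  // election settled, no recursive rejoin needed
            }
            // instead of recursing back into joinElection, loop and retry
        }
        throw new IllegalStateException("gave up after " + MAX_REJOINS + " rejoins");
    }

    public static void main(String[] args) {
        int[] n = {0};
        // Stub election that settles on the third attempt.
        System.out.println(joinElection(() -> ++n[0] == 3));
    }
}
```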

 StackOverflowException in Solr cloud's leader election
 --

 Key: SOLR-6213
 URL: https://issues.apache.org/jira/browse/SOLR-6213
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10, Trunk
Reporter: Dawid Weiss
Priority: Critical
 Attachments: stackoverflow.txt


 This is what's causing test hangs (at least on FreeBSD, LUCENE-5786), 
 possibly on other machines too. The problem is stack overflow from looped 
 calls in:
 {code}
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)

[jira] [Comment Edited] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541960#comment-14541960
 ] 

Nicholas Knize edited comment on LUCENE-6450 at 5/13/15 2:12 PM:
-

yes yes!  That's the idea anyway.  I've tinkered with this a bit already.  It 
took the same amount of time to build the Automaton as it did the ranges (no 
surprises since it used the same logic to union binaryIntervals) but queries 
were on the order of 10x slower (0.8sec/query on 60M points).  Thinking maybe 
there's some optimization to the automaton that needs to be done?  I figured 
first make progress here and post a separate issue for the automaton WIP.


was (Author: nknize):
yes yes!  That's the idea anyway.  I've tinkered with this a bit already.  It 
took the same amount of time to build the Automaton as it did the ranges (no 
surprises since it used the same logic) but queries were on the order of 10x 
slower (0.8sec/query on 60M points).  Thinking maybe there's some optimization 
to the automaton that needs to be done?  I figured first make progress here and 
post a separate issue for the automaton WIP.

 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (s4j, jts) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries leverage simple 
 multi-phase filtering that starts by leveraging NumericRangeQuery to reduce 
 candidate terms deferring the more expensive mathematics to the smaller 
 candidate sets.






[jira] [Comment Edited] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542001#comment-14542001
 ] 

Karl Wright edited comment on LUCENE-6480 at 5/13/15 2:28 PM:
--

bq. The question is the max value of the altitude?

To clarify, geo3d is not using (lat,lon,altitude) tuples.  When I said (x,y,z) 
I meant unit sphere (x,y,z), where z = sin(lat), x = cos(lat)*cos(lon), y = 
cos(lat)*sin(lon).  The reason you'd want to pack (x,y,z) instead of just 
(lat,lon) is that computing cosines and sines is quite expensive, so you don't 
want to be constructing a geo3d.GeoPoint using lat/lon at document scoring 
time.  Instead you'd want to unpack the (x,y,z) values directly from the 
Geo3DPointField. The range of *all three* parameters in this case is -1 to 1, 
which is how I came up with the packing resolution I did.

bq. Locality is important since it will drive the complexity of the range 
search and how much the postings list will actually help

The reason you need (x,y,z) instead of (lat,lon) at scoring time is because 
geo3d determines whether a point is within the shape using math that requires 
points to be in that form.  If you do that, then the evaluation of membership 
is blindingly fast.  The splitting proposal does have locality.
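Karl's conversion as a standalone sketch (the real geo3d classes differ; this just shows the trig being paid once, up front, rather than per scored document):

```java
// Sketch: convert (lat, lon) in degrees to a unit-sphere (x, y, z) point.
// Each component lies in [-1, 1], matching the packing range described above.
public class UnitSphere {
    public static double[] toUnitSphere(double latDeg, double lonDeg) {
        double lat = Math.toRadians(latDeg), lon = Math.toRadians(lonDeg);
        double cosLat = Math.cos(lat);
        return new double[] {
            cosLat * Math.cos(lon),  // x
            cosLat * Math.sin(lon),  // y
            Math.sin(lat)            // z
        };
    }

    public static void main(String[] args) {
        // Done once at index time; scoring unpacks (x, y, z) directly.
        double[] p = toUnitSphere(51.5, -0.1);
        System.out.println(p[0] + " " + p[1] + " " + p[2]);
    }
}
```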





was (Author: kwri...@metacarta.com):
bq. The question is the max value of the altitude?

To clarify, geo3d is not using (lat,lon,altitude) tuples.  When I said (x,y,z) 
I meant unit sphere (x,y,z), where z = sin(lat), x = cos(lat)*cos(lon), y = 
cos(lat)*sin(lon).  The reason you'd want to pack (x,y,z) instead of just 
(lat,lon) is that computing cosines and sines is quite expensive, so you don't 
want to be constructing a geo3d.GeoPoint using lat/lon at document scoring 
time.  Instead you'd want to unpack the (x,y,z) values directly from the 
Geo3DPointField. The range of *all three* parameters in this case is -1 to 1, 
which is how I came up with the packing resolution I did.

bq. Locality is important since it will drive the complexity of the range 
search and how much the postings list will actually help

The reason you need (x,y,z) instead of (lat,lon) at scoring time is because 
geo3d determines whether a point is within the shape using math that requires 
points to be in that form.  If you do that, then the evaluation of membership 
is blindingly fast.  The splitting proposal does have locality.




 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542050#comment-14542050
 ] 

Nicholas Knize commented on LUCENE-6480:


bq. ...when I said (x,y,z) I meant unit sphere (x,y,z).

Ah, yes reprojecting would be the right way. So why not just use ECEF then 
instead of the unit sphere? It's a better approximation of the earth. Or have 
you tried this and the few extra trig computations impaired performance?  Could 
try SloppyMath in that case and evaluate the performance/precision trade-off?

bq. ...you are basically using recursive descent, intersecting with the 
ordering in the posting list..

No. Using the terms dictionary and only checking high precision terms for 
boundary ranges and using the postings list for lower resolution terms 
completely contained.

bq.  Is membership of a point within the shape sufficient?

Core geo search is meant for simple use cases, points only, contains only.  In 
that case, if a point is contained by a query bbox or polygon it is added to 
the result set. Anything more advanced than this (e.g., DE9IM) is intended for 
the shape module.

 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542081#comment-14542081
 ] 

Michael McCandless commented on LUCENE-6480:


I'm not following closely here (yet!) but just wanted to say: we shouldn't feel 
like we must use at most 8 bytes to encode lat+lon+altitude, since we are 
indexing into arbitrary byte[] terms in the postings ...

I mean, the fewer bytes the better, but there's not a hard limit of 8.

 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Commented] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541948#comment-14541948
 ] 

David Smiley commented on LUCENE-6450:
--

bq. As a side note, I'm finishing up a patch that uses precision_step for 
indexing the longs at variable resolution to take advantage of the postings 
list and not visit every term. The index will be slightly bigger but it should 
provide the foundation for faster search on large polygons and bounding boxes.

If I'm not mistaken, the term auto-prefixing that Mike worked on means we need 
not do that here, especially just for point data; no?

 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (s4j, jts) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries leverage simple 
 multi-phase filtering that starts by leveraging NumericRangeQuery to reduce 
 candidate terms deferring the more expensive mathematics to the smaller 
 candidate sets.
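The "simple bit twiddling operations (currently morton hashing)" in the description amount to interleaving the bits of the two encoded coordinates into one long. A minimal sketch of that interleaving (illustrative only; Lucene's actual GeoPointField encoding quantizes lat/lon first and differs in detail):

```java
public class Morton {
    // Spread the lower 32 bits of v so they occupy the even bit positions
    // of a 64-bit value (classic "magic mask" bit spreading).
    static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8))  & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2))  & 0x3333333333333333L;
        v = (v | (v << 1))  & 0x5555555555555555L;
        return v;
    }

    // Interleave x into even bit positions and y into odd positions,
    // producing a single morton code whose sort order groups nearby points.
    public static long interleave(int x, int y) {
        return spread(x) | (spread(y) << 1);
    }
}
```

Because the interleaved code preserves locality, a 2-d box roughly decomposes into a small set of 1-d ranges over these codes, which is what lets NumericRangeQuery prune candidate terms.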






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542001#comment-14542001
 ] 

Karl Wright commented on LUCENE-6480:
-

bq. The question is the max value of the altitude?

To clarify, geo3d is not using (lat,lon,altitude) tuples.  When I said (x,y,z) 
I meant unit sphere (x,y,z), where z = sin(lat), x = cos(lat)*cos(lon), y = 
cos(lat)*sin(lon).  The reason you'd want to pack (x,y,z) instead of just 
(lat,lon) is that computing cosines and sines is quite expensive, so you don't 
want to be constructing a geo3d.GeoPoint using lat/lon at document scoring 
time.  Instead you'd want to unpack the (x,y,z) values directly from the 
Geo3DPointField. The range of *all three* parameters in this case is -1 to 1, 
which is how I came up with the packing resolution I did.

bq. Locality is important since it will drive the complexity of the range 
search and how much the postings list will actually help

The reason you need (x,y,z) instead of (lat,lon) at scoring time is because 
geo3d determines whether a point is within the shape using math that requires 
points to be in that form.  If you do that, then the evaluation of membership 
is blindingly fast.  The splitting proposal does have locality.
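The unit-sphere conversion described above can be written out directly (a minimal sketch; geo3d's real GeoPoint class carries more than this):

```java
public class UnitSphere {
    // Convert (lat, lon) in degrees to a unit-sphere (x, y, z) point:
    // z = sin(lat), x = cos(lat)*cos(lon), y = cos(lat)*sin(lon).
    // All three components fall in [-1, 1], which is why a fixed packing
    // resolution over that range works.
    public static double[] toXyz(double latDeg, double lonDeg) {
        double lat = Math.toRadians(latDeg);
        double lon = Math.toRadians(lonDeg);
        return new double[] {
            Math.cos(lat) * Math.cos(lon), // x
            Math.cos(lat) * Math.sin(lon), // y
            Math.sin(lat)                  // z
        };
    }
}
```

Storing (x, y, z) rather than (lat, lon) trades a little index space for avoiding these trig calls at document scoring time.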




 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Created] (SOLR-7542) Schema API: Can't remove single dynamic copy field directive

2015-05-13 Thread Steve Rowe (JIRA)
Steve Rowe created SOLR-7542:


 Summary: Schema API: Can't remove single dynamic copy field 
directive
 Key: SOLR-7542
 URL: https://issues.apache.org/jira/browse/SOLR-7542
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.1
Reporter: Steve Rowe
 Fix For: 5.2


In a managed schema containing just a single dynamic copy field directive - 
i.e. a glob source or destination - deleting the copy field directive fails.  
For example, the default configset (data_driven_schema_configs) has such a 
schema: the {{*}}->{{\_text\_}} copy field directive is the only one. 

To reproduce:

{noformat}
bin/solr start -c
bin/solr create my_solr_coll
curl "http://localhost:8983/solr/my_solr_coll/schema" -d '{"delete-copy-field":{"source":"*", "dest":"_text_"}}'
{noformat}

The deletion fails, and an NPE is logged: 

{noformat}
ERROR - 2015-05-13 12:37:36.780; [my_solr_coll shard1 core_node1 
my_solr_coll_shard1_replica1] org.apache.solr.common.SolrException; 
null:java.lang.NullPointerException
at 
org.apache.solr.schema.IndexSchema.getCopyFieldProperties(IndexSchema.java:1450)
at 
org.apache.solr.schema.IndexSchema.getNamedPropertyValues(IndexSchema.java:1406)
at org.apache.solr.schema.IndexSchema.persist(IndexSchema.java:390)
at 
org.apache.solr.schema.SchemaManager.doOperations(SchemaManager.java:120)
at 
org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:94)
at 
org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:57)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
[...]
{noformat}






[jira] [Assigned] (SOLR-7542) Schema API: Can't remove single dynamic copy field directive

2015-05-13 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe reassigned SOLR-7542:


Assignee: Steve Rowe

 Schema API: Can't remove single dynamic copy field directive
 

 Key: SOLR-7542
 URL: https://issues.apache.org/jira/browse/SOLR-7542
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Fix For: 5.2


 In a managed schema containing just a single dynamic copy field directive - 
 i.e. a glob source or destination - deleting the copy field directive fails.  
 For example, the default configset (data_driven_schema_configs) has such a 
 schema: the {{*}}->{{\_text\_}} copy field directive is the only one. 
 To reproduce:
 {noformat}
 bin/solr start -c
 bin/solr create my_solr_coll
 curl "http://localhost:8983/solr/my_solr_coll/schema" -d '{"delete-copy-field":{"source":"*", "dest":"_text_"}}'
 {noformat}
 The deletion fails, and an NPE is logged: 
 {noformat}
 ERROR - 2015-05-13 12:37:36.780; [my_solr_coll shard1 core_node1 
 my_solr_coll_shard1_replica1] org.apache.solr.common.SolrException; 
 null:java.lang.NullPointerException
 at 
 org.apache.solr.schema.IndexSchema.getCopyFieldProperties(IndexSchema.java:1450)
 at 
 org.apache.solr.schema.IndexSchema.getNamedPropertyValues(IndexSchema.java:1406)
 at org.apache.solr.schema.IndexSchema.persist(IndexSchema.java:390)
 at 
 org.apache.solr.schema.SchemaManager.doOperations(SchemaManager.java:120)
 at 
 org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:94)
 at 
 org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:57)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
 [...]
 {noformat}






[jira] [Commented] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541960#comment-14541960
 ] 

Nicholas Knize commented on LUCENE-6450:


yes yes!  That's the idea anyway.  I've tinkered with this a bit already.  It 
took the same amount of time to build the Automaton as it did the ranges (no 
surprises since it used the same logic) but queries were on the order of 10x 
slower (0.8sec/query on 60M points).  Thinking maybe there's some optimization 
to the automaton that needs to be done?  I figured first make progress here and 
post a separate issue for the automaton WIP.

 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (s4j, jts) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries leverage simple 
 multi-phase filtering that starts by leveraging NumericRangeQuery to reduce 
 candidate terms deferring the more expensive mathematics to the smaller 
 candidate sets.






[jira] [Comment Edited] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542001#comment-14542001
 ] 

Karl Wright edited comment on LUCENE-6480 at 5/13/15 2:29 PM:
--

bq. The question is the max value of the altitude?

To clarify, geo3d is not using (lat,lon,altitude) tuples.  When I said (x,y,z) 
I meant unit sphere (x,y,z), where z = sin(lat), x = cos(lat)*cos(lon), y = 
cos(lat)*sin(lon).  The reason you'd want to pack (x,y,z) instead of just 
(lat,lon) is that computing cosines and sines is quite expensive, so you don't 
want to be constructing a geo3d.GeoPoint using lat/lon at document scoring 
time.  Instead you'd want to unpack the (x,y,z) values directly from the 
Geo3DPointField. The range of *all three* parameters in this case is -1 to 1, 
which is how I came up with the packing resolution I did.

bq. Locality is important since it will drive the complexity of the range 
search and how much the postings list will actually help

The reason you need (x,y,z) instead of (lat,lon) at scoring time is because 
geo3d determines whether a point is within the shape using math that requires 
points to be in that form.  If you do that, then the evaluation of membership 
is blindingly fast.  The splitting proposal does have locality.





was (Author: kwri...@metacarta.com):
bq. The question is the max value of the altitude?

To clarify, geo3d is not using (lat,lon,altitude) tuples.  When I said (x,y,z) 
I meant unit sphere (x,y,z), where z = sin(lat), x = cos(lat)*cos(lon), y = 
cos(lat)*sin(lon).  The reason you'd want to pack (x,y,z) instead of just 
(lat,lon) is that computing cosines and sines is quite expensive, so you don't 
want to be constructing a geo3d.GeoPoint using lat/lon at document scoring 
time.  Instead you'd want to unpack the (x,y,z) values directly from the 
Geo3DPointField. The range of *all three* parameters in this case is -1 to 1, 
which is how I came up with the packing resolution I did.

bq. Locality is important since it will drive the complexity of the range 
search and how much the postings list will actually help

The reason you need (x,y,z) instead of (lat,lon) at scoring time is because 
geo3d determines whether a point is within the shape using math that requires 
points to be in that form.  If you do that, then the evaluation of membership 
is blindingly fast.  The splitting proposal does have locality.




 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542019#comment-14542019
 ] 

Karl Wright commented on LUCENE-6480:
-

It also occurs to me that I don't fully understand how you intend to perform a 
fast search involving geo3d for records that are within a specified geo3d 
shape.  Perhaps you could clarify in general terms how you would foresee doing 
that?  I get that you are basically using recursive descent, intersecting with 
the ordering in the posting list, but then I get fuzzy.  What kind of boolean 
decisions need to be made?  Is membership of a point within the shape 
sufficient?  Point me at the technique as written up elsewhere if you like...

 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






Re: Recent Java 9 commit (e5b66323ae45) breaks fsync on directory

2015-05-13 Thread Brian Burkhalter
Hi Uwe,

On May 13, 2015, at 2:27 AM, Uwe Schindler uschind...@apache.org wrote:

 many thanks for opening this issue!

You’re welcome!

 I agree with Alan that adding an OpenOption would be a good possibility. In 
 any case, as Files only contains static methods, we could still add a 
 “utility” method that forces file/directory buffers to disk, that just uses 
 the new open option under the hood. By that FileSystem SPI interfaces do not 
 need to be modified and just need to take care about the new OpenOption (if 
 supported).

I started to investigate both avenues. Alan says he has some notes on previous 
work on the OpenOption avenue and I would like to see them before proceeding 
much further.

 There is one additional issue we found recently on MacOSX, but this is only 
 slightly related to the one here. It looks like on MacOSX, FileChannel#force 
 is mostly a noop regarding syncing data to disk, because the underlying 
 operating system requires a “special” fcntl to force buffers to disk device:
  
 https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/fsync.2.html:
  For applications that require tighter guarantees about the integrity of 
 their data, Mac OS X provides
  the F_FULLFSYNC fcntl.  The F_FULLFSYNC fcntl asks the drive to flush 
 all buffered data to permanent
  storage.  Applications, such as databases, that require a strict 
 ordering of writes should use F_FULLFSYNC
  to ensure that their data is written in the order they expect.  Please 
 see fcntl(2) for more
  detail.
  
 This different behavior breaks the guarantees of FileChannel#force on MacOSX 
 (as described in Javadocs). So the MacOSX FileSystemProvider implementation 
 should use this special fcntl to force file buffers to disk.
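For context, the directory-fsync pattern under discussion (roughly what Lucene's IOUtils.fsync does; this is a sketch, not Lucene's actual code) opens the path with a FileChannel and calls force(true). Whether that call on a directory actually syncs metadata, or whether the directory can even be opened, is exactly the platform-dependent behavior this thread is about:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public final class DirFsync {
    // Force a file (or, on platforms that permit it, a directory) to stable
    // storage. Directories must be opened READ since they cannot be opened
    // for write; force(true) on such a channel may be rejected or be a no-op
    // depending on OS and JDK version.
    public static void fsync(Path path, boolean isDir) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE)) {
            ch.force(true); // metaData=true: flush content and metadata
        }
    }
}
```

An OpenOption (or a static utility method on Files layered over it) would let this intent be expressed portably instead of relying on the READ-a-directory trick.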


Thanks for mentioning this. I read all about the F_FULLFSYNC situation yesterday 
in the OS X man pages.

 Should I open a bug report on bugs.sun.com?

I don’t think there is any need. Perhaps we can simply handle the OS X variant 
under this issue unless someone objects.

Thanks,

Brian



[jira] [Commented] (SOLR-7531) Config API is merging certain key names together

2015-05-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542083#comment-14542083
 ] 

ASF subversion and git services commented on SOLR-7531:
---

Commit 1679223 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1679223 ]

SOLR-7531: added a test

 Config API is merging certain key names together
 

 Key: SOLR-7531
 URL: https://issues.apache.org/jira/browse/SOLR-7531
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.0, 5.1
Reporter: Shalin Shekhar Mangar
Assignee: Noble Paul
 Fix For: Trunk, 5.2


 Starting from a new Solr 5.0 install
 {code}
 ./bin/solr start -e schemaless
 curl 'http://localhost:8983/solr/gettingstarted/config' > config.json
 {code}
 Open config.json and note that there is a key called autoCommmitMaxDocs 
 under the updateHandler section.
 {code}
 curl 'http://localhost:8983/solr/gettingstarted/config' -H 
 'Content-type:application/json' -d '{"set-property" : 
 {"updateHandler.autoCommit.maxDocs" : 5000}}'
 curl 'http://localhost:8983/solr/gettingstarted/config' > config.json
 {code}
 Open config.json and note that both the value of updateHandler > autoCommit > 
 maxDocs and updateHandler > autoCommitMaxDocs is now set to 5000






[jira] [Commented] (SOLR-7542) Schema API: Can't remove single dynamic copy field directive

2015-05-13 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542039#comment-14542039
 ] 

Steve Rowe commented on SOLR-7542:
--

The issue is that the schema can't be persisted once there are no more dynamic 
copy fields (glob copy field directives).

One workaround is to first add another copy field directive (e.g. 
{{\*unlikely_field_suffix}}->{{\_text\_}}).  The copy field directive that you 
want to remove ({{*}}->{{\_text\_}} in our example) can then be successfully 
deleted.

The fix is a null check on the internal array containing the dynamic copy 
fields.

AFAICT, this is also a problem in schemas that start out with zero dynamic copy 
fields - in that case I think it won't be possible to make any schema 
modifications at all.
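The shape of the fix is simply a null guard before touching the array; schematically (the field and method names here are illustrative, not the actual IndexSchema internals):

```java
public class CopyFields {
    // Stand-in for IndexSchema's internal array of dynamic (glob) copy
    // field directives; it can be null once the last directive is deleted.
    private String[] dynamicCopyFields;

    public int countDynamic() {
        if (dynamicCopyFields == null) {
            return 0; // previously: NPE when persisting a schema with none left
        }
        return dynamicCopyFields.length;
    }
}
```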

 Schema API: Can't remove single dynamic copy field directive
 

 Key: SOLR-7542
 URL: https://issues.apache.org/jira/browse/SOLR-7542
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Fix For: 5.2


 In a managed schema containing just a single dynamic copy field directive - 
 i.e. a glob source or destination - deleting the copy field directive fails.  
 For example, the default configset (data_driven_schema_configs) has such a 
 schema: the {{*}}->{{\_text\_}} copy field directive is the only one. 
 To reproduce:
 {noformat}
 bin/solr start -c
 bin/solr create my_solr_coll
 curl "http://localhost:8983/solr/my_solr_coll/schema" -d '{"delete-copy-field":{"source":"*", "dest":"_text_"}}'
 {noformat}
 The deletion fails, and an NPE is logged: 
 {noformat}
 ERROR - 2015-05-13 12:37:36.780; [my_solr_coll shard1 core_node1 
 my_solr_coll_shard1_replica1] org.apache.solr.common.SolrException; 
 null:java.lang.NullPointerException
 at 
 org.apache.solr.schema.IndexSchema.getCopyFieldProperties(IndexSchema.java:1450)
 at 
 org.apache.solr.schema.IndexSchema.getNamedPropertyValues(IndexSchema.java:1406)
 at org.apache.solr.schema.IndexSchema.persist(IndexSchema.java:390)
 at 
 org.apache.solr.schema.SchemaManager.doOperations(SchemaManager.java:120)
 at 
 org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:94)
 at 
 org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:57)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
 [...]
 {noformat}






[jira] [Commented] (SOLR-7531) Config API is merging certain key names together

2015-05-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542041#comment-14542041
 ] 

ASF subversion and git services commented on SOLR-7531:
---

Commit 1679221 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1679221 ]

SOLR-7531: config API shows a few keys merged together

 Config API is merging certain key names together
 

 Key: SOLR-7531
 URL: https://issues.apache.org/jira/browse/SOLR-7531
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.0, 5.1
Reporter: Shalin Shekhar Mangar
Assignee: Noble Paul
 Fix For: Trunk, 5.2


 Starting from a new Solr 5.0 install
 {code}
 ./bin/solr start -e schemaless
 curl 'http://localhost:8983/solr/gettingstarted/config' > config.json
 {code}
 Open config.json and note that there is a key called autoCommmitMaxDocs 
 under the updateHandler section.
 {code}
 curl 'http://localhost:8983/solr/gettingstarted/config' -H 
 'Content-type:application/json' -d '{"set-property" : 
 {"updateHandler.autoCommit.maxDocs" : 5000}}'
 curl 'http://localhost:8983/solr/gettingstarted/config' > config.json
 {code}
 Open config.json and note that both the value of updateHandler > autoCommit > 
 maxDocs and updateHandler > autoCommitMaxDocs is now set to 5000






[jira] [Updated] (SOLR-7542) Schema API: Can't remove single dynamic copy field directive

2015-05-13 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-7542:
-
Attachment: SOLR-7542.patch

Patch with test that fails before applying the fix and succeeds afterward.

I've added null checks on all access to the dynamic copy fields array in 
IndexSchema.

Committing shortly.

 Schema API: Can't remove single dynamic copy field directive
 

 Key: SOLR-7542
 URL: https://issues.apache.org/jira/browse/SOLR-7542
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Fix For: 5.2

 Attachments: SOLR-7542.patch


 In a managed schema containing just a single dynamic copy field directive - 
 i.e. a glob source or destination - deleting the copy field directive fails.  
 For example, the default configset (data_driven_schema_configs) has such a 
 schema: the {{*}}->{{\_text\_}} copy field directive is the only one. 
 To reproduce:
 {noformat}
 bin/solr start -c
 bin/solr create my_solr_coll
 curl "http://localhost:8983/solr/my_solr_coll/schema" -d '{"delete-copy-field":{"source":"*", "dest":"_text_"}}'
 {noformat}
 The deletion fails, and an NPE is logged: 
 {noformat}
 ERROR - 2015-05-13 12:37:36.780; [my_solr_coll shard1 core_node1 
 my_solr_coll_shard1_replica1] org.apache.solr.common.SolrException; 
 null:java.lang.NullPointerException
 at 
 org.apache.solr.schema.IndexSchema.getCopyFieldProperties(IndexSchema.java:1450)
 at 
 org.apache.solr.schema.IndexSchema.getNamedPropertyValues(IndexSchema.java:1406)
 at org.apache.solr.schema.IndexSchema.persist(IndexSchema.java:390)
 at 
 org.apache.solr.schema.SchemaManager.doOperations(SchemaManager.java:120)
 at 
 org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:94)
 at 
 org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:57)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
 [...]
 {noformat}






[jira] [Created] (SOLR-7543) Create GraphQuery that allows graph traversal as a query operator.

2015-05-13 Thread Kevin Watters (JIRA)
Kevin Watters created SOLR-7543:
---

 Summary: Create GraphQuery that allows graph traversal as a query 
operator.
 Key: SOLR-7543
 URL: https://issues.apache.org/jira/browse/SOLR-7543
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Kevin Watters
Priority: Minor


I have a GraphQuery that I implemented a long time back that allows a user to 
specify a seedQuery to identify which documents to start graph traversal 
from.  It then gathers up the edge ids for those documents, optionally applying 
an additional filter.  The query is then re-executed continually until no new 
edge ids are identified.  I am currently hosting this code at 
https://github.com/kwatters/solrgraph and I would like to work with the 
community to get some feedback and ultimately get it committed back as a 
Lucene query.
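The traversal described is essentially a frontier expansion that stops when no new edge ids appear. Stripped of Lucene queries and filters, it reduces to this sketch over plain collections (names and the Map-based edge representation are illustrative, not solrgraph's actual API):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class GraphTraverse {
    // Expand from the seed set, following outgoing edges until the frontier
    // is exhausted; the visited set doubles as cycle protection and as the
    // "no new edge ids identified" termination test.
    public static Set<String> traverse(Set<String> seeds,
                                       Map<String, Set<String>> edges) {
        Set<String> visited = new HashSet<>(seeds);
        Deque<String> frontier = new ArrayDeque<>(seeds);
        while (!frontier.isEmpty()) {
            String node = frontier.poll();
            for (String next : edges.getOrDefault(node, Set.of())) {
                if (visited.add(next)) { // true only for newly seen ids
                    frontier.add(next);
                }
            }
        }
        return visited;
    }
}
```

In the Solr version each frontier step is a re-executed query against the edge field rather than a map lookup, which is what makes the operator composable with other queries.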






[jira] [Comment Edited] (SOLR-7275) Pluggable authorization module in Solr

2015-05-13 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542399#comment-14542399
 ] 

Anshum Gupta edited comment on SOLR-7275 at 5/13/15 6:40 PM:
-

Patch with test. I think this is good to go now.
Any feedback would be appreciated.


was (Author: anshumg):
Patch with test. I think this is good to go now.

 Pluggable authorization module in Solr
 --

 Key: SOLR-7275
 URL: https://issues.apache.org/jira/browse/SOLR-7275
 Project: Solr
  Issue Type: Sub-task
Reporter: Anshum Gupta
Assignee: Anshum Gupta
 Attachments: SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch


 Solr needs an interface that makes it easy for different authorization 
 systems to be plugged into it. Here's what I plan on doing:
 Define an interface {{SolrAuthorizationPlugin}} with one single method 
 {{isAuthorized}}. This would take in a {{SolrRequestContext}} object and 
 return an {{SolrAuthorizationResponse}} object. The object as of now would 
 only contain a single boolean value but in the future could contain more 
 information e.g. ACL for document filtering etc.
 The reason why we need a context object is so that the plugin doesn't need to 
 understand Solr's capabilities e.g. how to extract the name of the collection 
 or other information from the incoming request as there are multiple ways to 
 specify the target collection for a request. Similarly request type can be 
 specified by {{qt}} or {{/handler_name}}.
 Flow:
 Request -> SolrDispatchFilter -> isAuthorized(context) -> Process/Return.
 {code}
 public interface SolrAuthorizationPlugin {
   public SolrAuthorizationResponse isAuthorized(SolrRequestContext context);
 }
 {code}
 {code}
 public  class SolrRequestContext {
   UserInfo; // Will contain user context from the authentication layer.
   HTTPRequest request;
   Enum OperationType; // Correlated with user roles.
   String[] CollectionsAccessed;
   String[] FieldsAccessed;
   String Resource;
 }
 {code}
 {code}
 public class SolrAuthorizationResponse {
   boolean authorized;
   public boolean isAuthorized();
 }
 {code}
 User Roles: 
 * Admin
 * Collection Level:
   * Query
   * Update
   * Admin
 Using this framework, an implementation could be written for specific 
 security systems e.g. Apache Ranger or Sentry. It would keep all the security 
 system specific code out of Solr.
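A self-contained sketch of the proposed contract, plus a trivial allow-all plugin (the interface and class shapes come from the issue text above; the nested layout and the AllowAll implementation are purely illustrative):

```java
public class AuthSketch {
    interface SolrAuthorizationPlugin {
        SolrAuthorizationResponse isAuthorized(SolrRequestContext context);
    }

    // Context handed to plugins so they need not parse Solr requests
    // themselves; fields trimmed to one for brevity.
    static class SolrRequestContext {
        String collection;
    }

    static class SolrAuthorizationResponse {
        boolean authorized;
        boolean isAuthorized() { return authorized; }
    }

    // Degenerate implementation: authorizes everything. A real plugin
    // (e.g. one backed by Ranger or Sentry) would consult its ACLs here.
    static class AllowAll implements SolrAuthorizationPlugin {
        public SolrAuthorizationResponse isAuthorized(SolrRequestContext ctx) {
            SolrAuthorizationResponse r = new SolrAuthorizationResponse();
            r.authorized = true;
            return r;
        }
    }
}
```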






[jira] [Updated] (SOLR-7275) Pluggable authorization module in Solr

2015-05-13 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-7275:
---
Attachment: SOLR-7275.patch

Patch with test. I think this is good to go now.

 Pluggable authorization module in Solr
 --

 Key: SOLR-7275
 URL: https://issues.apache.org/jira/browse/SOLR-7275
 Project: Solr
  Issue Type: Sub-task
Reporter: Anshum Gupta
Assignee: Anshum Gupta
 Attachments: SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch








[jira] [Commented] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542429#comment-14542429
 ] 

Uwe Schindler commented on LUCENE-6450:
---

Looks OK regarding my comments about subclassing.

One thing: could you make the fields final in the query? Query should be 
immutable, so the min/maxLat/Lon doubles and polygon array are unmodifiable.

I will have a closer look later, I just skimmed through the patch.
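Concretely, the immutability being asked for just means declaring the bounds final and assigning them once in the constructor; a minimal sketch (class and field names illustrative, not the actual patch):

```java
// Sketch of an immutable bounding-box query's state, per the comment above.
// Final fields can only be assigned in the constructor, so the query cannot
// be mutated after construction — important because queries may be cached
// and shared across threads.
public class BoundingBoxSketch {
    private final double minLat, maxLat, minLon, maxLon;

    public BoundingBoxSketch(double minLat, double maxLat,
                             double minLon, double maxLon) {
        this.minLat = minLat;
        this.maxLat = maxLat;
        this.minLon = minLon;
        this.maxLon = maxLon;
    }

    public double getMinLat() { return minLat; }
    public double getMaxLat() { return maxLat; }
    public double getMinLon() { return minLon; }
    public double getMaxLon() { return maxLon; }
}
```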

 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, 
 LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (s4j, jts) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries leverage simple 
 multi-phase filtering that starts by leveraging NumericRangeQuery to reduce 
 candidate terms deferring the more expensive mathematics to the smaller 
 candidate sets.
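For readers unfamiliar with morton hashing, the bit-interleaving idea can be sketched in a few lines. This is an illustrative toy, not the actual GeoPointField encoding — the scaling and bit layout here are assumptions:

```java
public class MortonSketch {
    // Scale a coordinate in [-max, +max] to an unsigned 32-bit value.
    public static long scale(double value, double max) {
        return (long) (((value + max) / (2 * max)) * 0xFFFFFFFFL);
    }

    // Spread the low 32 bits of v so they occupy the even bit positions
    // (standard "part1by1" bit-twiddling).
    public static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8))  & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2))  & 0x3333333333333333L;
        v = (v | (v << 1))  & 0x5555555555555555L;
        return v;
    }

    // Interleave lat/lon into a single long term: lon bits on even positions,
    // lat bits on odd. Nearby points share a long common bit prefix, which is
    // what makes prefix/range-based term filtering possible.
    public static long encode(double lat, double lon) {
        return spread(scale(lon, 180.0)) | (spread(scale(lat, 90.0)) << 1);
    }
}
```

Two nearby points thus map to numerically close terms, so NumericRangeQuery can cheaply narrow the candidate set before any per-document distance math runs.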






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-05-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542474#comment-14542474
 ] 

David Smiley commented on LUCENE-6371:
--

I really like this design, because it enables one to build a highlighter that 
is accurate (so-called “query debugging”).  That’s a huge bonus I wasn’t 
expecting from this patch (based on the issue title/description).  But I think 
something is missing — SpanCollector.collectLeaf doesn’t provide access to the 
SpanQuery or perhaps the Term that is being collected.

Might SpanCollector.DEFAULT be renamed to NO_OP?  Same for 
BufferedSpanCollector.NO_OP.  I think NO_OP is more clear as to what this 
implementation does.

PayloadSpanCollector should use BytesRefArray instead of an ArrayList<byte[]>, 
and it can return this from getPayloads().

What is the purpose of the start & end position params to collectLeaf()?  No 
implementation uses them (on consumer or implementer side) and I'm not sure how 
they might be used.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Priority: Minor
 Attachments: LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Resolved] (SOLR-7531) Config API is merging certain key names together

2015-05-13 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul resolved SOLR-7531.
--
Resolution: Pending Closed

 Config API is merging certain key names together
 

 Key: SOLR-7531
 URL: https://issues.apache.org/jira/browse/SOLR-7531
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.0, 5.1
Reporter: Shalin Shekhar Mangar
Assignee: Noble Paul
 Fix For: Trunk, 5.2


 Starting from a new Solr 5.0 install
 {code}
 ./bin/solr start -e schemaless
 curl 'http://localhost:8983/solr/gettingstarted/config' > config.json
 {code}
 Open config.json and note that there is a key called autoCommmitMaxDocs 
 under the updateHandler section.
 {code}
 curl 'http://localhost:8983/solr/gettingstarted/config' -H 
 'Content-type:application/json' -d '{"set-property": 
 {"updateHandler.autoCommit.maxDocs": 5000}}'
 curl 'http://localhost:8983/solr/gettingstarted/config' > config.json
 {code}
 Open config.json and note that both the value of updateHandler > autoCommit > 
 maxDocs and updateHandler > autoCommitMaxDocs is now set to 5000






[JENKINS] Lucene-Solr-5.x-MacOSX (64bit/jdk1.8.0) - Build # 2254 - Failure!

2015-05-13 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/2254/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.cloud.MultiThreadedOCPTest.test

Error Message:
Captured an uncaught exception in thread: Thread[id=1487, 
name=parallelCoreAdminExecutor-629-thread-14, state=RUNNABLE, 
group=TGRP-MultiThreadedOCPTest]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=1487, 
name=parallelCoreAdminExecutor-629-thread-14, state=RUNNABLE, 
group=TGRP-MultiThreadedOCPTest]
at 
__randomizedtesting.SeedInfo.seed([41BFB6F148D0F9A9:C9EB892BE62C9451]:0)
Caused by: java.lang.AssertionError: Too many closes on SolrCore
at __randomizedtesting.SeedInfo.seed([41BFB6F148D0F9A9]:0)
at org.apache.solr.core.SolrCore.close(SolrCore.java:1138)
at org.apache.solr.common.util.IOUtils.closeQuietly(IOUtils.java:31)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:535)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:494)
at 
org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:628)
at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:213)
at 
org.apache.solr.handler.admin.CoreAdminHandler$ParallelCoreAdminHandlerThread.run(CoreAdminHandler.java:1249)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:148)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 9808 lines...]
   [junit4] Suite: org.apache.solr.cloud.MultiThreadedOCPTest
   [junit4]   2 Creating dataDir: 
/Users/jenkins/workspace/Lucene-Solr-5.x-MacOSX/solr/build/solr-core/test/J1/temp/solr.cloud.MultiThreadedOCPTest
 41BFB6F148D0F9A9-001/init-core-data-001
   [junit4]   2 295271 T1169 oas.SolrTestCaseJ4.buildSSLConfig Randomized ssl 
(false) and clientAuth (false)
   [junit4]   2 295271 T1169 oas.BaseDistributedSearchTestCase.initHostContext 
Setting hostContext system property: /b/
   [junit4]   2 295274 T1169 oasc.ZkTestServer.run STARTING ZK TEST SERVER
   [junit4]   2 295275 T1170 oasc.ZkTestServer$2$1.setClientPort client 
port:0.0.0.0/0.0.0.0:0
   [junit4]   2 295275 T1170 oasc.ZkTestServer$ZKServerMain.runFromConfig 
Starting server
   [junit4]   2 295377 T1169 oasc.ZkTestServer.run start zk server on 
port:54899
   [junit4]   2 295377 T1169 
oascc.SolrZkClient.createZkCredentialsToAddAutomatically Using default 
ZkCredentialsProvider
   [junit4]   2 295379 T1169 oascc.ConnectionManager.waitForConnected Waiting 
for client to connect to ZooKeeper
   [junit4]   2 295387 T1177 oascc.ConnectionManager.process Watcher 
org.apache.solr.common.cloud.ConnectionManager@3867cbb3 
name:ZooKeeperConnection Watcher:127.0.0.1:54899 got event WatchedEvent 
state:SyncConnected type:None path:null path:null type:None
   [junit4]   2 295388 T1169 oascc.ConnectionManager.waitForConnected Client 
is connected to ZooKeeper
   [junit4]   2 295388 T1169 oascc.SolrZkClient.createZkACLProvider Using 
default ZkACLProvider
   [junit4]   2 295388 T1169 oascc.SolrZkClient.makePath makePath: /solr
   [junit4]   2 295398 T1169 
oascc.SolrZkClient.createZkCredentialsToAddAutomatically Using default 
ZkCredentialsProvider
   [junit4]   2 295400 T1169 oascc.ConnectionManager.waitForConnected Waiting 
for client to connect to ZooKeeper
   [junit4]   2 295404 T1180 oascc.ConnectionManager.process Watcher 
org.apache.solr.common.cloud.ConnectionManager@1cb44d54 
name:ZooKeeperConnection Watcher:127.0.0.1:54899/solr got event WatchedEvent 
state:SyncConnected type:None path:null path:null type:None
   [junit4]   2 295405 T1169 oascc.ConnectionManager.waitForConnected Client 
is connected to ZooKeeper
   [junit4]   2 295405 T1169 oascc.SolrZkClient.createZkACLProvider Using 
default ZkACLProvider
   [junit4]   2 295405 T1169 oascc.SolrZkClient.makePath makePath: 
/collections/collection1
   [junit4]   2 295412 T1169 oascc.SolrZkClient.makePath makePath: 
/collections/collection1/shards
   [junit4]   2 295420 T1169 oascc.SolrZkClient.makePath makePath: 
/collections/control_collection
   [junit4]   2 295425 T1169 oascc.SolrZkClient.makePath makePath: 
/collections/control_collection/shards
   [junit4]   2 295429 T1169 oasc.AbstractZkTestCase.putConfig put 
/Users/jenkins/workspace/Lucene-Solr-5.x-MacOSX/solr/core/src/test-files/solr/collection1/conf/solrconfig-tlog.xml
 to /configs/conf1/solrconfig.xml
   [junit4]   2 295429 T1169 oascc.SolrZkClient.makePath makePath: 
/configs/conf1/solrconfig.xml
   [junit4]   2 295435 T1169 oasc.AbstractZkTestCase.putConfig put 

[jira] [Updated] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms

2015-05-13 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6481:
---
Attachment: LUCENE-6481_WIP.patch

First cut WIP patch. The LuceneUtil benchmark shows false negatives, though, so 
this is definitely not ready. So far I've been unable to reproduce the false 
negatives; I'm putting it here to iterate on improvements.

*GeoPointField*

Index Time:  640.24 sec
Index Size: 4.4G
Mean Query Time:  0.02 sec

 Improve GeoPointField type to only visit high precision boundary terms 
 ---

 Key: LUCENE-6481
 URL: https://issues.apache.org/jira/browse/LUCENE-6481
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Nicholas Knize
 Attachments: LUCENE-6481_WIP.patch


 Current GeoPointField [LUCENE-6450 | 
 https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges 
 along the space-filling curve that represent a provided bounding box.  This 
 determines which terms to visit in the terms dictionary and which to skip. 
 This is suboptimal for large bounding boxes as we may end up visiting all 
 terms (which could be quite large). 
 This incremental improvement is to improve GeoPointField to only visit high 
 precision terms in boundary ranges and use the postings list for ranges that 
 are completely within the target bounding box.
 A separate improvement is to switch over to auto-prefix and build an 
 Automaton representing the bounding box.  That can be tracked in a separate 
 issue.  
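The boundary-vs-interior distinction described above can be sketched as follows. This is an illustrative toy, not the actual patch — the range representation and the precise check are assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// One range of encoded terms along the space-filling curve. "boundary" means
// the range crosses the edge of the query box, so its terms still need the
// high-precision check; a non-boundary range lies fully inside the box.
class CurveRange {
    final long min, max;
    final boolean boundary;

    CurveRange(long min, long max, boolean boundary) {
        this.min = min; this.max = max; this.boundary = boundary;
    }

    boolean contains(long term) { return term >= min && term <= max; }
}

public class BoundaryFilterSketch {
    // Interior ranges accept every candidate term unconditionally (their
    // postings can be consumed as-is); only boundary ranges pay for the
    // expensive precise predicate.
    public static List<Long> filter(List<CurveRange> ranges, long[] terms,
                                    LongPredicate preciseCheck) {
        List<Long> accepted = new ArrayList<>();
        for (long t : terms) {
            for (CurveRange r : ranges) {
                if (r.contains(t)) {
                    if (!r.boundary || preciseCheck.test(t)) {
                        accepted.add(t);
                    }
                    break; // ranges are disjoint: first hit decides
                }
            }
        }
        return accepted;
    }
}
```

The win for large bounding boxes is that most ranges are interior, so most terms skip the precise check entirely.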






[jira] [Comment Edited] (SOLR-7143) MoreLikeThis Query Parser does not handle multiple field names

2015-05-13 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542663#comment-14542663
 ] 

Anshum Gupta edited comment on SOLR-7143 at 5/13/15 8:40 PM:
-

Hi Jens, Sorry but I haven't been able to get to this all this while.

Here's what we need to get working:
# Way to specify multiple values for a field within the local params. e.g.:
{code:title=SOLR-2798 would solve this}
http://localhost:8983/solr/techproducts/select?q={!mlt qf=foo qf=bar}docid
{code}
# We also need to support parameter dereferencing as you suggested, considering 
we don't want to get involved with commas:
{code}
http://localhost:8983/solr/techproducts/select?q={!mlt 
qf=$mlt.fl}docid&mlt.fl=foo&mlt.fl=bar
{code}
Supporting commas would interfere with the syntax used for things like bf e.g. 
{{bf=recip(rord(creationDate),1,1000,1000)}}

If you have time and the motivation, it'd be great if you contribute a patch 
for this. We may already have parts of it from the existing patch.


was (Author: anshumg):
Hi Jens, Sorry but I haven't been able to get to this all this while.

Here's what we need to get working:
# Way to specify multiple values for a field within the local params. e.g.:
{code:title=SOLR-2798 would solve this}
http://localhost:8983/solr/techproducts/select?q={!mlt qf=foo qf=bar}docid
{code}
# We also need to support parameter dereferencing as you suggested, considering 
we don't want to get involved with commas:
{code}
http://localhost:8983/solr/techproducts/select?q={!mlt 
qf=$mlt.fl}docid&mlt.fl=foo&mlt.fl=bar
{code}
Supporting commas would interfere with the syntax used for things like bf e.g. 
{{bf=recip(rord(creationDate),1,1000,1000)}}

 MoreLikeThis Query Parser does not handle multiple field names
 --

 Key: SOLR-7143
 URL: https://issues.apache.org/jira/browse/SOLR-7143
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 5.0
Reporter: Jens Wille
Assignee: Anshum Gupta
 Attachments: SOLR-7143.patch, SOLR-7143.patch


 The newly introduced MoreLikeThis Query Parser (SOLR-6248) does not return 
 any results when supplied with multiple fields in the {{qf}} parameter.
 To reproduce within the techproducts example, compare:
 {code}
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name%7DMA147LL/A'
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=features%7DMA147LL/A'
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name,features%7DMA147LL/A'
 {code}
 The first two queries return 8 and 5 results, respectively. The third query 
 doesn't return any results (not even the matched document).
 In contrast, the MoreLikeThis Handler works as expected (accounting for the 
 default {{mintf}} and {{mindf}} values in SimpleMLTQParser):
 {code}
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name&mlt.mintf=1&mlt.mindf=1'
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=features&mlt.mintf=1&mlt.mindf=1'
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name,features&mlt.mintf=1&mlt.mindf=1'
 {code}
 After adding the following line to 
 {{example/techproducts/solr/techproducts/conf/solrconfig.xml}}:
 {code:language=XML}
 <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
 {code}
 The first two queries return 7 and 4 results, respectively (excluding the 
 matched document). The third query returns 7 results, as one would expect.






[JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_45) - Build # 4807 - Failure!

2015-05-13 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4807/
Java: 64bit/jdk1.8.0_45 -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED:  org.apache.solr.core.TestArbitraryIndexDir.testLoadNewIndexDir

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at 
__randomizedtesting.SeedInfo.seed([E296220DB3BA94EB:BCC99352D230443]:0)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:794)
at 
org.apache.solr.core.TestArbitraryIndexDir.testLoadNewIndexDir(TestArbitraryIndexDir.java:128)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=*[count(//doc)=1]
xml response was: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int 
name="QTime">0</int></lst><result name="response" numFound="0" 
start="0"></result>
</response>

request was: q=id:2&qt=standard&start=0&rows=20&version=2.2

[jira] [Updated] (SOLR-7275) Pluggable authorization module in Solr

2015-05-13 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-7275:
---
Attachment: SOLR-7275.patch

Accidentally added the SimpleSolrAuthorizationPlugin to the last patch. 
Removing it. Also added 2 more static file extensions to ignore for authz 
purposes.

 Pluggable authorization module in Solr
 --

 Key: SOLR-7275
 URL: https://issues.apache.org/jira/browse/SOLR-7275
 Project: Solr
  Issue Type: Sub-task
Reporter: Anshum Gupta
Assignee: Anshum Gupta
 Attachments: SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch








[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-05-13 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371.patch

Updated patch:
* collectLeaf() now takes PostingsEnum and Term
* the default impls are renamed to NO_OP

Changing from Collection<byte[]> to BytesRefArray is a great idea, but I'd like 
to do that in a separate issue as that affects the external SpanQuery API a 
fair amount.  This patch currently only changes internals.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Priority: Minor
 Attachments: LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (SOLR-7275) Pluggable authorization module in Solr

2015-05-13 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542521#comment-14542521
 ] 

Noble Paul commented on SOLR-7275:
--

We need to tackle the modification of security.json pretty soon, but that can 
be dealt with separately.

The security.json file needs to be watched and the plugin notified of the 
change. That should not prevent us from committing this.


 Pluggable authorization module in Solr
 --

 Key: SOLR-7275
 URL: https://issues.apache.org/jira/browse/SOLR-7275
 Project: Solr
  Issue Type: Sub-task
Reporter: Anshum Gupta
Assignee: Anshum Gupta
 Attachments: SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch








[jira] [Commented] (SOLR-7143) MoreLikeThis Query Parser does not handle multiple field names

2015-05-13 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542663#comment-14542663
 ] 

Anshum Gupta commented on SOLR-7143:


Hi Jens, Sorry but I haven't been able to get to this all this while.

Here's what we need to get working:
# Way to specify multiple values for a field within the local params. e.g.:
{code:title=SOLR-2798 would solve this}
http://localhost:8983/solr/techproducts/select?q={!mlt qf=foo qf=bar}docid
{code}
# We also need to support parameter dereferencing as you suggested, considering 
we don't want to get involved with commas:
{code}
http://localhost:8983/solr/techproducts/select?q={!mlt 
qf=$mlt.fl}docid&mlt.fl=foo&mlt.fl=bar
{code}
Supporting commas would interfere with the syntax used for things like bf e.g. 
{{bf=recip(rord(creationDate),1,1000,1000)}}

 MoreLikeThis Query Parser does not handle multiple field names
 --

 Key: SOLR-7143
 URL: https://issues.apache.org/jira/browse/SOLR-7143
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 5.0
Reporter: Jens Wille
Assignee: Anshum Gupta
 Attachments: SOLR-7143.patch, SOLR-7143.patch


 The newly introduced MoreLikeThis Query Parser (SOLR-6248) does not return 
 any results when supplied with multiple fields in the {{qf}} parameter.
 To reproduce within the techproducts example, compare:
 {code}
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name%7DMA147LL/A'
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=features%7DMA147LL/A'
 curl 
 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name,features%7DMA147LL/A'
 {code}
 The first two queries return 8 and 5 results, respectively. The third query 
 doesn't return any results (not even the matched document).
 In contrast, the MoreLikeThis Handler works as expected (accounting for the 
 default {{mintf}} and {{mindf}} values in SimpleMLTQParser):
 {code}
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name&mlt.mintf=1&mlt.mindf=1'
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=features&mlt.mintf=1&mlt.mindf=1'
 curl 
 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name,features&mlt.mintf=1&mlt.mindf=1'
 {code}
 After adding the following line to 
 {{example/techproducts/solr/techproducts/conf/solrconfig.xml}}:
 {code:language=XML}
 <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
 {code}
 The first two queries return 7 and 4 results, respectively (excluding the 
 matched document). The third query returns 7 results, as one would expect.






[JENKINS] Lucene-Solr-5.x-Linux (32bit/jdk1.8.0_45) - Build # 12491 - Failure!

2015-05-13 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/12491/
Java: 32bit/jdk1.8.0_45 -server -XX:+UseG1GC

1 tests failed.
FAILED:  org.apache.solr.cloud.CloudExitableDirectoryReaderTest.test

Error Message:
No live SolrServers available to handle this 
request:[https://127.0.0.1:33066/_/aa/collection1]

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request:[https://127.0.0.1:33066/_/aa/collection1]
at 
__randomizedtesting.SeedInfo.seed([9951855D48D4B18A:1105BA87E628DC72]:0)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:355)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:856)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:799)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:943)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:958)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.queryServer(AbstractFullDistribZkTestBase.java:1425)
at 
org.apache.solr.cloud.CloudExitableDirectoryReaderTest.assertPartialResults(CloudExitableDirectoryReaderTest.java:102)
at 
org.apache.solr.cloud.CloudExitableDirectoryReaderTest.doTimeoutTests(CloudExitableDirectoryReaderTest.java:86)
at 
org.apache.solr.cloud.CloudExitableDirectoryReaderTest.test(CloudExitableDirectoryReaderTest.java:53)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 

[jira] [Commented] (SOLR-7468) Kerberos authentication module

2015-05-13 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542913#comment-14542913
 ] 

Anshum Gupta commented on SOLR-7468:


Here's some feedback:

# Can we avoid the addition of the extra Servlet Filter (KerberosFilter)? Now 
that SDF is essentially a wrapper, perhaps we could reuse the wrapper.
# If we do #1, we also wouldn't need the change/hack in the JettySolrRunner. 
Also, no change would be needed in MiniSolrCloudCluster.
# Minor but important, I noticed a lot of unused imports, you should clean 
those up.

 Kerberos authentication module
 --

 Key: SOLR-7468
 URL: https://issues.apache.org/jira/browse/SOLR-7468
 Project: Solr
  Issue Type: New Feature
  Components: security
Reporter: Ishan Chattopadhyaya
 Attachments: SOLR-7468.patch, SOLR-7468.patch, SOLR-7468.patch


 SOLR-7274 introduces a pluggable authentication framework. This issue 
 provides a Kerberos plugin implementation.






[jira] [Commented] (LUCENE-6319) Delegating OneMerge

2015-05-13 Thread Elliott Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542945#comment-14542945
 ] 

Elliott Bradshaw commented on LUCENE-6319:
--

Just thought I'd ping on this.  Any thoughts?  Greatest patch ever?  Seriously 
though, I know this touches a lot of hardcore internal classes, so I get it if 
people are wary.  If anyone has any suggestions of a different route, I'm more 
than happy to explore it.

 Delegating OneMerge
 ---

 Key: LUCENE-6319
 URL: https://issues.apache.org/jira/browse/LUCENE-6319
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Elliott Bradshaw
 Attachments: SOLR-6319.patch


 In trying to integrate SortingMergePolicy into ElasticSearch, I ran into an 
 issue where the custom merge logic was being stripped out by 
 IndexUpgraderMergeSpecification.  Related issue here:
 https://github.com/elasticsearch/elasticsearch/issues/9731
 In an endeavor to fix this, I attempted to create a DelegatingOneMerge that 
 could be used to chain the different MergePolicies together.  I quickly 
 discovered this to be impossible, due to the direct member variable access of 
 OneMerge by IndexWriter and other classes.  It would be great if this 
 variable access could be privatized and the consuming classes modified to use 
 the appropriate getters and setters.  Here's an example DelegatingOneMerge 
 and modified OneMerge.
 https://gist.github.com/ebradshaw/e0b74e9e8d4976ab9e0a
 https://gist.github.com/ebradshaw/d72116a014f226076303
 The downside here is that this would require an API change, as there are 
 three public variables in OneMerge: estimatedMergeBytes, segments and 
 totalDocCount.  These would have to be moved behind public getters.
 Without this change, I'm not sure how we could get the SortingMergePolicy 
 working in ES, but if anyone has any other suggestions I'm all ears!  Thanks!






[jira] [Updated] (LUCENE-6459) [suggest] Query Interface for suggest API

2015-05-13 Thread Areek Zillur (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Areek Zillur updated LUCENE-6459:
-
Attachment: LUCENE-6459.patch

Updated Patch:
 - don't allow zero length suggestion values
 - improve docs
 - added tests

 [suggest] Query Interface for suggest API
 -

 Key: LUCENE-6459
 URL: https://issues.apache.org/jira/browse/LUCENE-6459
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.1
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: Trunk, 5.x, 5.1

 Attachments: LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, 
 LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch


 This patch factors out common indexing/search API used by the recently 
 introduced [NRTSuggester|https://issues.apache.org/jira/browse/LUCENE-6339]. 
 The motivation is to provide a query interface for FST-based fields 
 (*SuggestField* and *ContextSuggestField*) for enabling suggestion scoring 
 and more powerful automaton queries. 
 Previously, only prefix ‘queries’ with index-time weights were supported, but 
 we can also support:
 * Prefix queries expressed as regular expressions:  get suggestions that 
 match multiple prefixes
   ** Example: _star\[wa\|tr\]_ matches _starwars_ and _startrek_
 * Fuzzy Prefix queries supporting scoring: get typo tolerant suggestions 
 scored by how close they are to the query prefix
 ** Example: querying for _seper_ will score _separate_ higher than 
 _superstitious_
 * Context Queries: get suggestions boosted and/or filtered based on their 
 indexed contexts (meta data)
 ** Example: get typo tolerant suggestions on song names with prefix _like 
 a roling_ boosting songs with genre _rock_ and _indie_
 ** Example: get suggestion on all file names starting with _finan_ only 
 for _user1_ and _user2_
 h3. Suggest API
 {code}
 SuggestIndexSearcher searcher = new SuggestIndexSearcher(reader);
 CompletionQuery query = ...
 TopSuggestDocs suggest = searcher.suggest(query, num);
 {code}
 h3. CompletionQuery
 *CompletionQuery* is used to query *SuggestField* and *ContextSuggestField*. 
 A *CompletionQuery* produces a *CompletionWeight*, which allows 
 *CompletionQuery* implementations to pass in an automaton that will be 
 intersected with a FST and allows boosting and meta data extraction from the 
 intersected partial paths. A *CompletionWeight* produces a 
 *CompletionScorer*. A *CompletionScorer* executes a Top N search against the 
 FST with the provided automaton, scoring and filtering all matched paths. 
 h4. PrefixCompletionQuery
 Return documents with values that match the prefix of an analyzed term text. 
 Documents are sorted according to their suggest field weight. 
 {code}
 PrefixCompletionQuery(Analyzer analyzer, Term term)
 {code}
 h4. RegexCompletionQuery
 Return documents with values that match the prefix of a regular expression. 
 Documents are sorted according to their suggest field weight.
 {code}
 RegexCompletionQuery(Term term)
 {code}
 h4. FuzzyCompletionQuery
 Return documents with values that have prefixes within a specified edit 
 distance of an analyzed term text.
 Documents are ‘boosted’ by the number of matching prefix letters of the 
 suggestion with respect to the original term text.
 {code}
 FuzzyCompletionQuery(Analyzer analyzer, Term term)
 {code}
 h5. Scoring
 {{suggestion_weight + (global_maximum_weight * boost)}}
 where {{suggestion_weight}}, {{global_maximum_weight}} and {{boost}} are all 
 integers. 
 {{boost = # of prefix characters matched}}
 h4. ContextQuery
 Return documents that match a {{CompletionQuery}} filtered and/or boosted by 
 provided context(s). 
 {code}
 ContextQuery(CompletionQuery query)
 contextQuery.addContext(CharSequence context, int boost, boolean exact)
 {code}
 *NOTE:* {{ContextQuery}} should be used with {{ContextSuggestField}} to query 
 suggestions boosted and/or filtered by contexts
 h5. Scoring
 {{suggestion_weight + (global_maximum_weight * context_boost)}}
 where {{suggestion_weight}}, {{global_maximum_weight}} and {{context_boost}} 
 are all integers
 When used with {{FuzzyCompletionQuery}},
 {{suggestion_weight + (global_maximum_weight * (context_boost + 
 fuzzy_boost))}}
 h3. Context Suggest Field
 To use {{ContextQuery}}, use {{ContextSuggestField}} instead of 
 {{SuggestField}}. Any {{CompletionQuery}} can be used with 
 {{ContextSuggestField}}, the default behaviour is to return suggestions from 
 *all* contexts. {{Context}} for every completion hit can be accessed through 
 {{SuggestScoreDoc#context}}.
 {code}
 ContextSuggestField(String name, Collection<CharSequence> contexts, String 
 value, int weight) 
 {code}
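The integer scoring formulas quoted above can be checked with simple arithmetic. The sketch below is plain Java implementing only those formulas; the class and method names are illustrative, not part of the Lucene suggest API.

```java
// Worked example of the suggest scoring formulas quoted above.
// All quantities are integers, per the description; names are hypothetical.
class SuggestScoring {

  // FuzzyCompletionQuery: suggestion_weight + (global_maximum_weight * boost),
  // where boost = number of prefix characters matched.
  static int fuzzyScore(int suggestionWeight, int globalMaxWeight, int prefixCharsMatched) {
    return suggestionWeight + globalMaxWeight * prefixCharsMatched;
  }

  // ContextQuery over a fuzzy query:
  // suggestion_weight + (global_maximum_weight * (context_boost + fuzzy_boost)).
  static int contextFuzzyScore(int suggestionWeight, int globalMaxWeight,
                               int contextBoost, int fuzzyBoost) {
    return suggestionWeight + globalMaxWeight * (contextBoost + fuzzyBoost);
  }
}
```

For example, with a global maximum weight of 100, a suggestion of weight 40 that matches 3 prefix characters scores 40 + 100 * 3 = 340, so the number of matched prefix characters dominates the index-time weight, as intended.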





[jira] [Commented] (SOLR-6273) Cross Data Center Replication

2015-05-13 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542953#comment-14542953
 ] 

Erick Erickson commented on SOLR-6273:
--

[~arcadius] Sorry it took a while to get back to you, but currently CDCR is 
active-passive, not active-active, so the scenario you asked about shouldn't 
arise.

 Cross Data Center Replication
 -

 Key: SOLR-6273
 URL: https://issues.apache.org/jira/browse/SOLR-6273
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
Assignee: Erick Erickson
 Attachments: SOLR-6273-trunk.patch, SOLR-6273.patch, SOLR-6273.patch, 
 SOLR-6273.patch, SOLR-6273.patch


 This is the master issue for Cross Data Center Replication (CDCR)
 described at a high level here: 
 http://heliosearch.org/solr-cross-data-center-replication/






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542165#comment-14542165
 ] 

Karl Wright commented on LUCENE-6480:
-

Depends on the application.  With 64 bits the resolution can be 6.07 meters, 
which is probably good enough for most uses.


 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Updated] (SOLR-7503) Recovery after ZK session expiration happens in a single thread for all cores in a node

2015-05-13 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-7503:
-
Attachment: SOLR-7503.patch

Simple patch that registers cores in the background after ZK session 
expiration. I had to add some getter methods for the ExecutionService in the 
ZkContainer so that it is available to the ZkController when needed (iff cc is 
not null). I didn't want to use a new ExecutionService since the one setup by 
ZkContainer seemed most appropriate for this work, but you can't expose 
ZkContainer directly in ZkController because it's only a server-side thing.

 Recovery after ZK session expiration happens in a single thread for all cores 
 in a node
 ---

 Key: SOLR-7503
 URL: https://issues.apache.org/jira/browse/SOLR-7503
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.1
Reporter: Shalin Shekhar Mangar
Assignee: Timothy Potter
  Labels: impact-high
 Fix For: Trunk, 5.2

 Attachments: SOLR-7503.patch


 Currently cores are registered in parallel in an executor. However, when 
 there's a ZK expiration, the recovery, which also happens in the register 
 call, happens in a single thread:
 https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L300
 We should make these happen in parallel as well so that recovery after ZK 
 expiration doesn't take forever.
 Thanks to [~mewmewball] for catching this.






[jira] [Reopened] (SOLR-6968) add hyperloglog in statscomponent as an approximate count

2015-05-13 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reopened SOLR-6968:


Doing some perf testing, i found that the SPARSE representation of HLL can 
cause some heinous response times for large sets -- a minor part of the issue 
seems to be the slower insertion rate compared to the FULL representation 
(documented), but a much bigger factor is that _merging_ multiple (large) 
SPARSE HLLs is almost 10x slower than merging FULL HLLs of the same size.

it might be worth adding tuning options and/or heuristics to control if/when 
SPARSE representation should be used (in cases where folks have smaller sets 
and care more about memory than speed), but for now I'm just going to disable it.

 add hyperloglog in statscomponent as an approximate count
 -

 Key: SOLR-6968
 URL: https://issues.apache.org/jira/browse/SOLR-6968
 Project: Solr
  Issue Type: Sub-task
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: Trunk, 5.2

 Attachments: SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, 
 SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch


 stats component currently supports calcDistinct but it's terribly 
 inefficient -- especially in distrib mode.
 we should add support for using hyperloglog to compute an approximate count 
 of distinct values (using localparams via SOLR-6349 to control the precision 
 of the approximation)






[jira] [Commented] (SOLR-7542) Schema API: Can't remove single dynamic copy field directive

2015-05-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542124#comment-14542124
 ] 

ASF subversion and git services commented on SOLR-7542:
---

Commit 1679229 from [~steve_rowe] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1679229 ]

SOLR-7542: Schema API: Can't remove single dynamic copy field directive (merged 
trunk r1679225)

 Schema API: Can't remove single dynamic copy field directive
 

 Key: SOLR-7542
 URL: https://issues.apache.org/jira/browse/SOLR-7542
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Fix For: 5.2

 Attachments: SOLR-7542.patch


 In a managed schema containing just a single dynamic copy field directive - 
 i.e. a glob source or destination - deleting the copy field directive fails.  
 For example, the default configset (data_driven_schema_configs) has such a 
 schema: the {{*}}->{{\_text\_}} copy field directive is the only one. 
 To reproduce:
 {noformat}
 bin/solr start -c
 bin/solr create my_solr_coll
 curl "http://localhost:8983/solr/my_solr_coll/schema" 
 -d '{"delete-copy-field":{"source":"*", "dest":"_text_"}}'
 {noformat}
 The deletion fails, and an NPE is logged: 
 {noformat}
 ERROR - 2015-05-13 12:37:36.780; [my_solr_coll shard1 core_node1 
 my_solr_coll_shard1_replica1] org.apache.solr.common.SolrException; 
 null:java.lang.NullPointerException
 at 
 org.apache.solr.schema.IndexSchema.getCopyFieldProperties(IndexSchema.java:1450)
 at 
 org.apache.solr.schema.IndexSchema.getNamedPropertyValues(IndexSchema.java:1406)
 at org.apache.solr.schema.IndexSchema.persist(IndexSchema.java:390)
 at 
 org.apache.solr.schema.SchemaManager.doOperations(SchemaManager.java:120)
 at 
 org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:94)
 at 
 org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:57)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
 [...]
 {noformat}






[jira] [Resolved] (SOLR-7542) Schema API: Can't remove single dynamic copy field directive

2015-05-13 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved SOLR-7542.
--
Resolution: Fixed

Committed to trunk and branch_5x.

 Schema API: Can't remove single dynamic copy field directive
 

 Key: SOLR-7542
 URL: https://issues.apache.org/jira/browse/SOLR-7542
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Fix For: 5.2

 Attachments: SOLR-7542.patch


 In a managed schema containing just a single dynamic copy field directive - 
 i.e. a glob source or destination - deleting the copy field directive fails.  
 For example, the default configset (data_driven_schema_configs) has such a 
 schema: the {{*}}->{{\_text\_}} copy field directive is the only one. 
 To reproduce:
 {noformat}
 bin/solr start -c
 bin/solr create my_solr_coll
 curl "http://localhost:8983/solr/my_solr_coll/schema" 
 -d '{"delete-copy-field":{"source":"*", "dest":"_text_"}}'
 {noformat}
 The deletion fails, and an NPE is logged: 
 {noformat}
 ERROR - 2015-05-13 12:37:36.780; [my_solr_coll shard1 core_node1 
 my_solr_coll_shard1_replica1] org.apache.solr.common.SolrException; 
 null:java.lang.NullPointerException
 at 
 org.apache.solr.schema.IndexSchema.getCopyFieldProperties(IndexSchema.java:1450)
 at 
 org.apache.solr.schema.IndexSchema.getNamedPropertyValues(IndexSchema.java:1406)
 at org.apache.solr.schema.IndexSchema.persist(IndexSchema.java:390)
 at 
 org.apache.solr.schema.SchemaManager.doOperations(SchemaManager.java:120)
 at 
 org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:94)
 at 
 org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:57)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
 [...]
 {noformat}






[jira] [Commented] (SOLR-7531) Config API is merging certain key names together

2015-05-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542085#comment-14542085
 ] 

ASF subversion and git services commented on SOLR-7531:
---

Commit 1679224 from [~noble.paul] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1679224 ]

SOLR-7531: config API shows a few keys merged together

 Config API is merging certain key names together
 

 Key: SOLR-7531
 URL: https://issues.apache.org/jira/browse/SOLR-7531
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.0, 5.1
Reporter: Shalin Shekhar Mangar
Assignee: Noble Paul
 Fix For: Trunk, 5.2


 Starting from a new Solr 5.0 install
 {code}
 ./bin/solr start -e schemaless
 curl 'http://localhost:8983/solr/gettingstarted/config' > config.json
 {code}
 Open config.json and note that there is a key called autoCommitMaxDocs 
 under the updateHandler section.
 {code}
 curl 'http://localhost:8983/solr/gettingstarted/config' -H 
 'Content-type:application/json' -d '{"set-property" : 
 {"updateHandler.autoCommit.maxDocs" : 5000}}'
 curl 'http://localhost:8983/solr/gettingstarted/config' > config.json
 {code}
 Open config.json and note that both updateHandler > autoCommit > 
 maxDocs and updateHandler > autoCommitMaxDocs are now set to 5000






[jira] [Commented] (LUCENE-6480) Extend Simple GeoPointField Type to 3d

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542149#comment-14542149
 ] 

Karl Wright commented on LUCENE-6480:
-

bq. So why not just use ECEF then instead of the unit sphere?

I presume your question is about geo3D in general.
There's quite a bit of math in Geo3D that relies on GeoPoints being on the unit 
sphere.  For that reason, using the unit sphere, or projecting to it at least, 
is preferred.  If you are only doing containment of a point, you may not run 
into some of the more complex Geo3D math, but if you are determining 
relationships of bounding boxes to shapes, or finding the bounding box of a 
shape, you can't dodge being on the unit sphere.

bq. Or have you tried this and the few extra trig computations impaired 
performance?

If you mean trying to map points on the earth onto the unit sphere, then it was 
simply unnecessary for our application.  The maximum error you can get, as I 
stated before, by using a sphere rather than a real earth model is a few 
meters.  I maintain that doing such a mapping at indexing time is probably 
straightforward, at some performance expense, but I view this as beyond the 
bounds of this project.
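The "project to the unit sphere" step mentioned above is, in isolation, just vector normalization. The plain-Java sketch below shows only that trivial step under the spherical-earth assumption discussed here; it deliberately ignores the real ellipsoid model, and the class and method names are hypothetical, not Geo3D API.

```java
// Minimal sketch: project a 3D point onto the unit sphere by normalizing it.
// This is only the trivial spherical case discussed above; a real earth model
// (ellipsoid) would need a proper geodetic conversion first.
class UnitSphere {
  static double[] project(double x, double y, double z) {
    double norm = Math.sqrt(x * x + y * y + z * z);
    // Points at the origin have no direction; callers must avoid that case.
    return new double[] { x / norm, y / norm, z / norm };
  }
}
```

After this step every point satisfies x^2 + y^2 + z^2 = 1, which is the invariant the Geo3D bounding-box and shape-relationship math relies on.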





 Extend Simple GeoPointField Type to 3d 
 ---

 Key: LUCENE-6480
 URL: https://issues.apache.org/jira/browse/LUCENE-6480
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Nicholas Knize

 [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a 
 simple GeoPointField type to lucene core. This field uses 64bit encoding of 2 
 dimensional points to construct sorted term representations of GeoPoints 
 (aka: GeoHashing).
 This feature investigates adding support for encoding 3 dimensional 
 GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding 
 an additional 3d constructor.






[jira] [Updated] (SOLR-6968) add hyperloglog in statscomponent as an approximate count

2015-05-13 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-6968:
---
Attachment: SOLR-6968_nosparse.patch

patch that worked well in my perf tests to disable SPARSE (and optimize some 
ram usage when merging EMPTY) ... will commit once {{ant precommit test}} 
finishes.

 add hyperloglog in statscomponent as an approximate count
 -

 Key: SOLR-6968
 URL: https://issues.apache.org/jira/browse/SOLR-6968
 Project: Solr
  Issue Type: Sub-task
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: Trunk, 5.2

 Attachments: SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, 
 SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, SOLR-6968_nosparse.patch


 stats component currently supports calcDistinct but it's terribly 
 inefficient -- especially in distrib mode.
 we should add support for using hyperloglog to compute an approximate count 
 of distinct values (using localparams via SOLR-6349 to control the precision 
 of the approximation)






[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-05-13 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371.patch

I've been playing around with various APIs for this, and I think this one works 
reasonably well.

Spans.isPayloadAvailable() and getPayload() are replaced with a collect() 
method that takes a SpanCollector.  If you want to get payloads from a Spans, 
you do the following:

{code:java}
PayloadSpanCollector collector = new PayloadSpanCollector();
while (spans.nextStartPosition() != NO_MORE_POSITIONS) {
  collector.reset();
  spans.collect(collector);
  doSomethingWith(collector.getPayloads());
}
{code}

The actual job of collecting information from postings lists is devolved to the 
collector itself (via SpanCollector.collectLeaf(), called from 
TermSpans.collect()).

The API is made slightly complicated by the need to buffer collected 
information in NearOrderedSpans, because the algorithm there moves child spans 
on eagerly when finding the smallest possible match, so by the time collect() 
is called we're out of position.  This is dealt with using a 
BufferedSpanCollector, with collectCandidate(Spans) and accept() methods.  The 
default (No-op) collector has a no-op implementation of this, which should get 
optimized away by HotSpot, meaning that we don't need to have separate 
implementations for collecting and non-collecting algorithms, and can do away 
with PayloadNearOrderedSpans.
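As a rough, self-contained illustration of the collector pattern described above (the class shapes and method signatures here are simplified assumptions, not the patch's exact API):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the collector callback: in the patch, collectLeaf()
// would be driven by TermSpans as it walks a postings list.
interface SpanCollector {
    void collectLeaf(byte[] payload, int position);
    void reset();
}

// Collects payloads reported by each leaf position into a buffer.
class PayloadSpanCollector implements SpanCollector {
    private final List<byte[]> payloads = new ArrayList<>();
    @Override public void collectLeaf(byte[] payload, int position) {
        if (payload != null) payloads.add(payload);
    }
    @Override public void reset() { payloads.clear(); }
    public List<byte[]> getPayloads() { return payloads; }
}

// A no-op collector: with empty bodies, HotSpot can inline the calls away, so
// the same matching code serves both collecting and non-collecting runs.
enum NoopSpanCollector implements SpanCollector {
    INSTANCE;
    @Override public void collectLeaf(byte[] payload, int position) {}
    @Override public void reset() {}
}

public class SpanCollectorDemo {
    public static void main(String[] args) {
        PayloadSpanCollector collector = new PayloadSpanCollector();
        collector.reset();
        // Pretend a term's spans reported two positions carrying payloads:
        collector.collectLeaf(new byte[] {1}, 3);
        collector.collectLeaf(new byte[] {2}, 7);
        System.out.println(collector.getPayloads().size());
    }
}
```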

This patch also moves the PayloadCheck queries to the .payloads package, which 
tidies things up a bit.

All tests pass.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Priority: Minor
 Attachments: LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Updated] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms

2015-05-13 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6481:
---
Attachment: LUCENE-6481.patch

The test had the lat and lon ordering incorrect for both GeoPointFieldType and 
the GeoPointInBBoxQuery. I've attached a new patch with the correction.  

testRandomTiny passes but there is one failure in testRandom with the following:
{noformat}
ant test -Dtestcase=TestGeoPointQuery -Dtestmethod=testRandom 
-Dtests.seed=F1E43F53709BFF82 -Dtests.verbose=true
{noformat}

{noformat}
   [junit4]   2 NOTE: reproduce with: ant test  -Dtestcase=TestGeoPointQuery 
-Dtests.method=testRandom -Dtests.seed=F1E43F53709BFF82 -Dtests.slow=true 
-Dtests.locale=en_US -Dtests.timezone=Africa/Lome -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] FAILURE 1.54s | TestGeoPointQuery.testRandom 
   [junit4] Throwable #1: java.lang.AssertionError: id=632 docID=613 
lat=46.19240875459866 lon=143.92476891121902 expected true but got: false 
deleted?=false
   [junit4]    at __randomizedtesting.SeedInfo.seed([F1E43F53709BFF82:83A81A5CC1FB49F1]:0)
   [junit4]    at org.apache.lucene.search.TestGeoPointQuery.verify(TestGeoPointQuery.java:302)
   [junit4]    at org.apache.lucene.search.TestGeoPointQuery.doTestRandom(TestGeoPointQuery.java:204)
   [junit4]    at org.apache.lucene.search.TestGeoPointQuery.testRandom(TestGeoPointQuery.java:130)
   [junit4]    at java.lang.Thread.run(Thread.java:745)
{noformat}

This should be enough to debug the issue. I expect to have a new patch sometime 
tomorrow or before week's end.

 Improve GeoPointField type to only visit high precision boundary terms 
 ---

 Key: LUCENE-6481
 URL: https://issues.apache.org/jira/browse/LUCENE-6481
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Nicholas Knize
 Attachments: LUCENE-6481.patch, LUCENE-6481.patch, 
 LUCENE-6481_WIP.patch


 Current GeoPointField [LUCENE-6450 | 
 https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges 
 along the space-filling curve that represent a provided bounding box.  This 
 determines which terms to visit in the terms dictionary and which to skip. 
 This is suboptimal for large bounding boxes as we may end up visiting all 
 terms (which could be quite large). 
 This incremental improvement is to improve GeoPointField to only visit high 
 precision terms in boundary ranges and use the postings list for ranges that 
 are completely within the target bounding box.
 A separate improvement is to switch over to auto-prefix and build an 
 Automaton representing the bounding box.  That can be tracked in a separate 
 issue.  






[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 3114 - Failure

2015-05-13 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/3114/

1 tests failed.
REGRESSION:  
org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeMixedAdds

Error Message:
soft529 wasn't fast enough

Stack Trace:
java.lang.AssertionError: soft529 wasn't fast enough
at 
__randomizedtesting.SeedInfo.seed([24CCE5AE308765DB:75181C2E81F4557C]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertNotNull(Assert.java:526)
at 
org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeMixedAdds(SoftAutoCommitTest.java:111)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 9443 lines...]
   [junit4] Suite: org.apache.solr.update.SoftAutoCommitTest
   [junit4]   2 Creating dataDir: 

[jira] [Commented] (SOLR-7275) Pluggable authorization module in Solr

2015-05-13 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543224#comment-14543224
 ] 

Anshum Gupta commented on SOLR-7275:


Thanks for the feedback Noble.

Right, as of now, a node restart would be required for security.json to be 
re-read. I'll create another issue for that and as I understand, you don't have 
an objection to committing this, right? :-)

 Pluggable authorization module in Solr
 --

 Key: SOLR-7275
 URL: https://issues.apache.org/jira/browse/SOLR-7275
 Project: Solr
  Issue Type: Sub-task
Reporter: Anshum Gupta
Assignee: Anshum Gupta
 Attachments: SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch


 Solr needs an interface that makes it easy for different authorization 
 systems to be plugged into it. Here's what I plan on doing:
 Define an interface {{SolrAuthorizationPlugin}} with one single method 
 {{isAuthorized}}. This would take in a {{SolrRequestContext}} object and 
 return an {{SolrAuthorizationResponse}} object. The object as of now would 
 only contain a single boolean value but in the future could contain more 
 information e.g. ACL for document filtering etc.
 The reason why we need a context object is so that the plugin doesn't need to 
 understand Solr's capabilities e.g. how to extract the name of the collection 
 or other information from the incoming request as there are multiple ways to 
 specify the target collection for a request. Similarly request type can be 
 specified by {{qt}} or {{/handler_name}}.
 Flow:
 Request -> SolrDispatchFilter -> isAuthorized(context) -> Process/Return.
 {code}
 public interface SolrAuthorizationPlugin {
   public SolrAuthorizationResponse isAuthorized(SolrRequestContext context);
 }
 {code}
 {code}
 public  class SolrRequestContext {
   UserInfo; // Will contain user context from the authentication layer.
   HTTPRequest request;
   Enum OperationType; // Correlated with user roles.
   String[] CollectionsAccessed;
   String[] FieldsAccessed;
   String Resource;
 }
 {code}
 {code}
 public class SolrAuthorizationResponse {
   boolean authorized;
   public boolean isAuthorized();
 }
 {code}
 User Roles: 
 * Admin
 * Collection Level:
   * Query
   * Update
   * Admin
 Using this framework, an implementation could be written for specific 
 security systems e.g. Apache Ranger or Sentry. It would keep all the security 
 system specific code out of Solr.
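A minimal sketch of what one concrete plugin against this proposed interface could look like -- the role-table lookup, the simplified context fields, and the RoleTablePlugin name are illustrative assumptions, not part of the patch:

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

enum OperationType { QUERY, UPDATE, ADMIN }

// Simplified stand-in for the proposed context: just enough fields for the demo.
class SolrRequestContext {
    final String user;            // would come from the authentication layer
    final String collection;
    final OperationType op;
    SolrRequestContext(String user, String collection, OperationType op) {
        this.user = user; this.collection = collection; this.op = op;
    }
}

class SolrAuthorizationResponse {
    private final boolean authorized;
    SolrAuthorizationResponse(boolean authorized) { this.authorized = authorized; }
    public boolean isAuthorized() { return authorized; }
}

interface SolrAuthorizationPlugin {
    SolrAuthorizationResponse isAuthorized(SolrRequestContext context);
}

// Hypothetical role-table plugin: "user@collection" maps to the operations
// that user may perform on that collection; everything else is denied.
class RoleTablePlugin implements SolrAuthorizationPlugin {
    private final Map<String, Set<OperationType>> grants = new HashMap<>();
    RoleTablePlugin grant(String user, String collection, Set<OperationType> ops) {
        grants.put(user + "@" + collection, ops);
        return this;
    }
    @Override
    public SolrAuthorizationResponse isAuthorized(SolrRequestContext ctx) {
        Set<OperationType> allowed = grants.get(ctx.user + "@" + ctx.collection);
        return new SolrAuthorizationResponse(allowed != null && allowed.contains(ctx.op));
    }
}

public class AuthDemo {
    public static void main(String[] args) {
        SolrAuthorizationPlugin plugin = new RoleTablePlugin()
            .grant("bob", "logs", EnumSet.of(OperationType.QUERY));
        System.out.println(plugin.isAuthorized(
            new SolrRequestContext("bob", "logs", OperationType.UPDATE)).isAuthorized());
    }
}
```

Because all the request introspection lives behind the context object, a plugin like this stays free of Solr-specific parsing, which is the point of the proposal.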






[JENKINS] Lucene-Solr-5.x-Windows (64bit/jdk1.7.0_80) - Build # 4687 - Failure!

2015-05-13 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Windows/4687/
Java: 64bit/jdk1.7.0_80 -XX:-UseCompressedOops -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.core.TestArbitraryIndexDir.testLoadNewIndexDir

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at 
__randomizedtesting.SeedInfo.seed([9DE047CFE9B60DA3:74BAFCF7772F9D0B]:0)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:794)
at 
org.apache.solr.core.TestArbitraryIndexDir.testLoadNewIndexDir(TestArbitraryIndexDir.java:128)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=*[count(//doc)=1]
xml response was: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst><result name="response" numFound="0" start="0"></result>
</response>

request 

[jira] [Commented] (SOLR-6968) add hyperloglog in statscomponent as an approximate count

2015-05-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542228#comment-14542228
 ] 

ASF subversion and git services commented on SOLR-6968:
---

Commit 1679241 from hoss...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1679241 ]

SOLR-6968: perf tweak: eliminate use of SPARSE storage option since it has some 
pathologically bad behavior for some set sizes (particularly when merging shard 
responses)

 add hyperloglog in statscomponent as an approximate count
 -

 Key: SOLR-6968
 URL: https://issues.apache.org/jira/browse/SOLR-6968
 Project: Solr
  Issue Type: Sub-task
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: Trunk, 5.2

 Attachments: SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, 
 SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, SOLR-6968_nosparse.patch


 stats component currently supports calcDistinct but it's terribly 
 inefficient -- especially in distrib mode.
 we should add support for using hyperloglog to compute an approximate count 
 of distinct values (using localparams via SOLR-6349 to control the precision 
 of the approximation)






[jira] [Created] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms

2015-05-13 Thread Nicholas Knize (JIRA)
Nicholas Knize created LUCENE-6481:
--

 Summary: Improve GeoPointField type to only visit high precision 
boundary terms 
 Key: LUCENE-6481
 URL: https://issues.apache.org/jira/browse/LUCENE-6481
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Nicholas Knize


Current GeoPointField [LUCENE-6450 | 
https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges 
along the space-filling curve that represent a provided bounding box.  This 
determines which terms to visit in the terms dictionary and which to skip. This 
is suboptimal for large bounding boxes as we may end up visiting all terms 
(which could be quite large). 

This incremental improvement is to improve GeoPointField to only visit high 
precision terms in boundary ranges and use the postings list for ranges that 
are completely within the target bounding box.

A separate improvement is to switch over to auto-prefix and build an Automaton 
representing the bounding box.  That can be tracked in a separate issue.  






[jira] [Commented] (SOLR-6968) add hyperloglog in statscomponent as an approximate count

2015-05-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542290#comment-14542290
 ] 

ASF subversion and git services commented on SOLR-6968:
---

Commit 1679250 from hoss...@apache.org in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1679250 ]

SOLR-6968: perf tweak: eliminate use of SPARSE storage option since it has some 
pathologically bad behavior for some set sizes (particularly when merging shard 
responses) (merge r1679241)

 add hyperloglog in statscomponent as an approximate count
 -

 Key: SOLR-6968
 URL: https://issues.apache.org/jira/browse/SOLR-6968
 Project: Solr
  Issue Type: Sub-task
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: Trunk, 5.2

 Attachments: SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, 
 SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, SOLR-6968_nosparse.patch


 stats component currently supports calcDistinct but it's terribly 
 inefficient -- especially in distrib mode.
 we should add support for using hyperloglog to compute an approximate count 
 of distinct values (using localparams via SOLR-6349 to control the precision 
 of the approximation)






[jira] [Commented] (SOLR-7542) Schema API: Can't remove single dynamic copy field directive

2015-05-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542087#comment-14542087
 ] 

ASF subversion and git services commented on SOLR-7542:
---

Commit 1679225 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1679225 ]

SOLR-7542: Schema API: Can't remove single dynamic copy field directive

 Schema API: Can't remove single dynamic copy field directive
 

 Key: SOLR-7542
 URL: https://issues.apache.org/jira/browse/SOLR-7542
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Fix For: 5.2

 Attachments: SOLR-7542.patch


 In a managed schema containing just a single dynamic copy field directive - 
 i.e. a glob source or destination - deleting the copy field directive fails.  
 For example, the default configset (data_driven_schema_configs) has such a 
 schema: the {{*}}->{{\_text\_}} copy field directive is the only one. 
 To reproduce:
 {noformat}
 bin/solr start -c
 bin/solr create my_solr_coll
 curl "http://localhost:8983/solr/my_solr_coll/schema" -d '{"delete-copy-field":{"source":"*", "dest":"_text_"}}'
 {noformat}
 The deletion fails, and an NPE is logged: 
 {noformat}
 ERROR - 2015-05-13 12:37:36.780; [my_solr_coll shard1 core_node1 
 my_solr_coll_shard1_replica1] org.apache.solr.common.SolrException; 
 null:java.lang.NullPointerException
 at 
 org.apache.solr.schema.IndexSchema.getCopyFieldProperties(IndexSchema.java:1450)
 at 
 org.apache.solr.schema.IndexSchema.getNamedPropertyValues(IndexSchema.java:1406)
 at org.apache.solr.schema.IndexSchema.persist(IndexSchema.java:390)
 at 
 org.apache.solr.schema.SchemaManager.doOperations(SchemaManager.java:120)
 at 
 org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:94)
 at 
 org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:57)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
 [...]
 {noformat}






[jira] [Resolved] (SOLR-6968) add hyperloglog in statscomponent as an approximate count

2015-05-13 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-6968.

Resolution: Fixed

 add hyperloglog in statscomponent as an approximate count
 -

 Key: SOLR-6968
 URL: https://issues.apache.org/jira/browse/SOLR-6968
 Project: Solr
  Issue Type: Sub-task
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: Trunk, 5.2

 Attachments: SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, 
 SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, SOLR-6968_nosparse.patch


 stats component currently supports calcDistinct but it's terribly 
 inefficient -- especially in distrib mode.
 we should add support for using hyperloglog to compute an approximate count 
 of distinct values (using localparams via SOLR-6349 to control the precision 
 of the approximation)






[jira] [Updated] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6450:
---
Attachment: LUCENE-6450.patch

I've decided to go ahead and just add a minor updated patch for commit 
consideration and save performance improvements for new issues like 
[LUCENE-6481 | https://issues.apache.org/jira/browse/LUCENE-6481].  This 
enables other patches, like the BKD-tree [LUCENE-6477 | 
https://issues.apache.org/jira/browse/LUCENE-6477] to use the helper classes 
provided by this patch, and other contributors to iterate improvements on this 
new field type. 

Patch includes the following updates:

* Changed GeoPointIn*Query to subclass MultiTermQuery instead of 
NumericRangeQuery leaving NRQ unchanged
* Removed unused DocValues from GeoPointField.FieldType (reduces size of index 
for now)
* Updated javadocs to reflect issues with large queries.
* 2 space indent formatting

Benchmarks are roughly the same, with a moderately reduced index size.

*GeoPointField*

Index Time:  160.545 sec
Index Size: 1.3G
Mean Query Time:  0.104 sec

 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, 
 LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (s4j, jts) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries leverage simple 
 multi-phase filtering that starts by leveraging NumericRangeQuery to reduce 
 candidate terms deferring the more expensive mathematics to the smaller 
 candidate sets.
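The "bit twiddling" encoding described above -- quantizing lat/lon and interleaving their bits into a single long (a morton code) -- can be sketched as follows. The quantization scheme and bit layout here are assumptions for illustration; the patch's actual encoding may differ:

```java
public class MortonDemo {
    // Quantize a coordinate in [min, max] to an unsigned 32-bit cell index.
    static long quantize(double v, double min, double max) {
        return (long) ((v - min) / (max - min) * 0xFFFFFFFFL);
    }

    // Spread the low 32 bits of x so they occupy the even bit positions.
    static long spread(long x) {
        x &= 0xFFFFFFFFL;
        x = (x | (x << 16)) & 0x0000FFFF0000FFFFL;
        x = (x | (x << 8))  & 0x00FF00FF00FF00FFL;
        x = (x | (x << 4))  & 0x0F0F0F0F0F0F0F0FL;
        x = (x | (x << 2))  & 0x3333333333333333L;
        x = (x | (x << 1))  & 0x5555555555555555L;
        return x;
    }

    // Inverse of spread(): collapse the even bit positions back to 32 bits.
    static long unspread(long x) {
        x &= 0x5555555555555555L;
        x = (x | (x >>> 1))  & 0x3333333333333333L;
        x = (x | (x >>> 2))  & 0x0F0F0F0F0F0F0F0FL;
        x = (x | (x >>> 4))  & 0x00FF00FF00FF00FFL;
        x = (x | (x >>> 8))  & 0x0000FFFF0000FFFFL;
        x = (x | (x >>> 16)) & 0x00000000FFFFFFFFL;
        return x;
    }

    // Interleave lon bits (even positions) with lat bits (odd positions).
    static long encode(double lat, double lon) {
        return spread(quantize(lon, -180, 180)) | (spread(quantize(lat, -90, 90)) << 1);
    }

    static double decodeLon(long code) {
        return unspread(code) / (double) 0xFFFFFFFFL * 360.0 - 180.0;
    }

    static double decodeLat(long code) {
        return unspread(code >>> 1) / (double) 0xFFFFFFFFL * 180.0 - 90.0;
    }

    public static void main(String[] args) {
        long code = encode(46.1924, 143.9248);
        System.out.printf("%.4f %.4f%n", decodeLat(code), decodeLon(code));
    }
}
```

Because nearby points share long morton-code prefixes, a bounding box maps to a modest set of contiguous term ranges along the curve -- which is what the NumericRangeQuery-style filtering exploits.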






[jira] [Commented] (SOLR-7537) Could not find or load main class org.apache.solr.util.SimplePostTool

2015-05-13 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542374#comment-14542374
 ] 

Erik Hatcher commented on SOLR-7537:


This works for me on Solr 5.1.0, doing `bin/solr create -c gettingstarted` and 
cd'ing into bin (an atypical thing to do, not what the quick start says to do, 
by the way) and running your command on the built-in docs directory of the 
install, `sh post -c gettingstarted ../docs`.   Is your 
dist/solr-core-5.1.0.jar there?   Something seems broken in your environment.

 Could not find or load main class org.apache.solr.util.SimplePostTool
 -

 Key: SOLR-7537
 URL: https://issues.apache.org/jira/browse/SOLR-7537
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 5.1
 Environment: Windows 8.1, cygwin4.3.33
Reporter: Peng Li

 In the solr-5.1.0/bin folder, I typed the command below (the ../doc folder has readme.docx):
 sh post -c gettingstarted ../doc
 And I got below exception:
 c:\Java\jdk1.8.0_20/bin/java -classpath 
 /cygdrive/c/Users/lipeng/_Main/Servers/solr-5.1.0/dist/solr-core-5.1.0.jar 
 -Dauto=yes -Dc=gettingstarted -Ddata=files -Drecursive=yes 
 org.apache.solr.util.SimplePostTool ../doc
 Error: Could not find or load main class org.apache.solr.util.SimplePostTool
 I followed instruction from here: 
 http://lucene.apache.org/solr/quickstart.html
 Can you help me take a look? Thank you!






[jira] [Updated] (SOLR-7275) Pluggable authorization module in Solr

2015-05-13 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-7275:
---
Attachment: SOLR-7275.patch

Patch that adds request type info for /select [READ] and /update [WRITE] 
requests.

 Pluggable authorization module in Solr
 --

 Key: SOLR-7275
 URL: https://issues.apache.org/jira/browse/SOLR-7275
 Project: Solr
  Issue Type: Sub-task
Reporter: Anshum Gupta
Assignee: Anshum Gupta
 Attachments: SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch


 Solr needs an interface that makes it easy for different authorization 
 systems to be plugged into it. Here's what I plan on doing:
 Define an interface {{SolrAuthorizationPlugin}} with one single method 
 {{isAuthorized}}. This would take in a {{SolrRequestContext}} object and 
 return an {{SolrAuthorizationResponse}} object. The object as of now would 
 only contain a single boolean value but in the future could contain more 
 information e.g. ACL for document filtering etc.
 The reason why we need a context object is so that the plugin doesn't need to 
 understand Solr's capabilities e.g. how to extract the name of the collection 
 or other information from the incoming request as there are multiple ways to 
 specify the target collection for a request. Similarly request type can be 
 specified by {{qt}} or {{/handler_name}}.
 Flow:
 Request -> SolrDispatchFilter -> isAuthorized(context) -> Process/Return.
 {code}
 public interface SolrAuthorizationPlugin {
   public SolrAuthorizationResponse isAuthorized(SolrRequestContext context);
 }
 {code}
 {code}
 public  class SolrRequestContext {
   UserInfo; // Will contain user context from the authentication layer.
   HTTPRequest request;
   Enum OperationType; // Correlated with user roles.
   String[] CollectionsAccessed;
   String[] FieldsAccessed;
   String Resource;
 }
 {code}
 {code}
 public class SolrAuthorizationResponse {
   boolean authorized;
   public boolean isAuthorized();
 }
 {code}
 User Roles: 
 * Admin
 * Collection Level:
   * Query
   * Update
   * Admin
 Using this framework, an implementation could be written for specific 
 security systems e.g. Apache Ranger or Sentry. It would keep all the security 
 system specific code out of Solr.






Re: Why morphlines code is in Solr?

2015-05-13 Thread Noble Paul
https://github.com/kite-sdk/kite/tree/master/kite-morphlines/kite-morphlines-solr-core/src/main/java/org/kitesdk/morphline/solr


and here is the Solr codebase

https://github.com/apache/lucene-solr/tree/trunk/solr/contrib/morphlines-core/src/java/org/apache/solr/morphlines/solr

Essentially the same files exist in both codebases. Ideally, the source
should be maintained in one project and the jar referenced from the
other project.

--Noble


On Wed, May 13, 2015 at 3:29 AM, Shawn Heisey apa...@elyograg.org wrote:

 On 5/12/2015 2:14 PM, Noble Paul wrote:
  When I said jar dependency , I did not mean , that we check in the jar
 
  we use httpclient, but if you checkout lucene trunk you don't get the
  httpclient jar ,but the build process will add it to the distribution

 Doesn't that describe what happens with morphlines?  The build process
 adds it to the distribution.

 If I'm completely missing the point of what you're saying, I'll shut up
 and let you elaborate and discuss it with other people who know what's
 going on.

 Thanks,
 Shawn






-- 
-
Noble Paul


RE: Why morphlines code is in Solr?

2015-05-13 Thread Uwe Schindler
Hi,

I think his question was why the morphlines contrib *source code* is in Solr at 
all. He argues that we could simply fetch the pre-built contrib module from 
Maven and not have a fork of the whole module in Solr.
Indeed, I also don't like that there are two nearly identical variants of the 
morphlines contrib...

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Shawn Heisey [mailto:apa...@elyograg.org]
 Sent: Tuesday, May 12, 2015 11:59 PM
 To: dev@lucene.apache.org
 Subject: Re: Why morphlines code is in Solr?
 
 On 5/12/2015 2:14 PM, Noble Paul wrote:
  When I said jar dependency , I did not mean , that we check in the jar
 
  we use httpclient, but if you checkout lucene trunk you don't get the
  httpclient jar ,but the build process will add it to the distribution
 
 Doesn't that describe what happens with morphlines?  The build process
 adds it to the distribution.
 
 If I'm completely missing the point of what you're saying, I'll shut up and 
 let
 you elaborate and discuss it with other people who know what's going on.
 
 Thanks,
 Shawn
 
 





[jira] [Updated] (SOLR-7275) Pluggable authorization module in Solr

2015-05-13 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-7275:
---
Attachment: SOLR-7275.patch

This patch filters out authz and context creation for *.png and *.html requests.
There were a lot of those coming in for the new Admin UI.

 Pluggable authorization module in Solr
 --

 Key: SOLR-7275
 URL: https://issues.apache.org/jira/browse/SOLR-7275
 Project: Solr
  Issue Type: Sub-task
Reporter: Anshum Gupta
Assignee: Anshum Gupta
 Attachments: SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, 
 SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch, SOLR-7275.patch


 Solr needs an interface that makes it easy for different authorization 
 systems to be plugged into it. Here's what I plan on doing:
 Define an interface {{SolrAuthorizationPlugin}} with a single method, 
 {{isAuthorized}}. It would take a {{SolrRequestContext}} object and 
 return a {{SolrAuthorizationResponse}} object. For now the response would 
 contain only a single boolean value, but in the future it could carry more 
 information, e.g. ACLs for document filtering.
 We need a context object so that the plugin doesn't have to understand 
 Solr's internals, e.g. how to extract the name of the target collection 
 from the incoming request, since there are multiple ways to specify it. 
 Similarly, the request type can be specified by {{qt}} or {{/handler_name}}.
 Flow:
 Request -> SolrDispatchFilter -> isAuthorized(context) -> Process/Return.
 {code}
 public interface SolrAuthorizationPlugin {
   public SolrAuthorizationResponse isAuthorized(SolrRequestContext context);
 }
 {code}
 {code}
 public class SolrRequestContext {
   UserInfo userInfo;            // user context from the authentication layer
   HttpServletRequest request;
   OperationType operationType;  // enum correlated with user roles
   String[] collectionsAccessed;
   String[] fieldsAccessed;
   String resource;
 }
 {code}
 {code}
 public class SolrAuthorizationResponse {
   boolean authorized;
   public boolean isAuthorized() {
     return authorized;
   }
 }
 {code}
 User Roles: 
 * Admin
 * Collection Level:
   * Query
   * Update
   * Admin
 Using this framework, an implementation could be written for a specific 
 security system, e.g. Apache Ranger or Sentry, keeping all of the 
 security-system-specific code out of Solr.






[jira] [Commented] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541578#comment-14541578
 ] 

Karl Wright commented on LUCENE-6450:
-

I have some ideas for a geohash over (x,y,z) values that may turn out to be of 
interest.  This geohash would have acceptable precision (a few meters) when 
packed into a long (64 bits).  Question: does Lucene efficiently support field 
types of that length?
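On the field-type question: Lucene's numeric fields already index 64-bit longs (NumericRangeQuery operates on them), so a hash packed into a long fits the existing machinery. For a rough sense of the precision math, here is a hypothetical packing (not Karl's actual encoding) that quantizes each unit-sphere coordinate to 21 bits, fitting three axes into 63 bits:

```java
public class XyzPack {
    // Pack unit-sphere coordinates (x, y, z each in [-1, 1]) into 63 bits of a
    // long, 21 bits per axis. The quantization step is 2 / 2^21 per axis, which
    // at an Earth radius of ~6.37e6 m is roughly 6 m -- "a few meters".
    static final int BITS = 21;
    static final long SCALE = (1L << BITS) - 1;

    static long encode(double x, double y, double z) {
        long xi = (long) (((x + 1.0) / 2.0) * SCALE);
        long yi = (long) (((y + 1.0) / 2.0) * SCALE);
        long zi = (long) (((z + 1.0) / 2.0) * SCALE);
        return (xi << (2 * BITS)) | (yi << BITS) | zi;
    }

    static double[] decode(long packed) {
        double x = ((packed >>> (2 * BITS)) & SCALE) * 2.0 / SCALE - 1.0;
        double y = ((packed >>> BITS) & SCALE) * 2.0 / SCALE - 1.0;
        double z = (packed & SCALE) * 2.0 / SCALE - 1.0;
        return new double[] { x, y, z };
    }

    public static void main(String[] args) {
        double[] p = decode(encode(0.123, -0.456, 0.789));
        System.out.printf("%.6f %.6f %.6f%n", p[0], p[1], p[2]);
    }
}
```

The round-trip error per axis is bounded by one quantization step, about 9.5e-7 in unit coordinates.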


 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (Spatial4j, JTS) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries use simple 
 multi-phase filtering that starts with NumericRangeQuery to reduce the 
 candidate terms, deferring the more expensive math to the smaller 
 candidate sets.
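The morton hashing mentioned above can be illustrated with the standard magic-mask bit spread: quantize lat and lon into 32-bit unsigned ranges, spread each value's bits to alternate positions, and OR the two together. This is a hedged sketch of the technique, not the actual GeoPointField code; method names are illustrative:

```java
public class MortonSketch {
    // Spread the low 32 bits of v so that bit i lands at position 2*i,
    // using the classic magic-mask interleave steps.
    static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8))  & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2))  & 0x3333333333333333L;
        v = (v | (v << 1))  & 0x5555555555555555L;
        return v;
    }

    // Interleave quantized lon (even bit positions) and lat (odd positions)
    // into a single 64-bit morton code.
    static long encode(double lat, double lon) {
        long latQ = (long) (((lat + 90.0) / 180.0) * 0xFFFFFFFFL);
        long lonQ = (long) (((lon + 180.0) / 360.0) * 0xFFFFFFFFL);
        return spread(lonQ) | (spread(latQ) << 1);
    }

    public static void main(String[] args) {
        System.out.println(Long.toBinaryString(spread(5L))); // 0b101 -> 10001
    }
}
```

The payoff of this encoding is that nearby points share long code prefixes, so contiguous term ranges correspond to spatial cells.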






[jira] [Updated] (LUCENE-6450) Add simple encoded GeoPointField type to core

2015-05-13 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6450:
---
Attachment: LUCENE-6450.patch

Updated patch to make Query fields final.

 Add simple encoded GeoPointField type to core
 -

 Key: LUCENE-6450
 URL: https://issues.apache.org/jira/browse/LUCENE-6450
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: Trunk, 5.x
Reporter: Nicholas Knize
Priority: Minor
 Attachments: LUCENE-6450-5x.patch, LUCENE-6450-TRUNK.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, LUCENE-6450.patch, 
 LUCENE-6450.patch, LUCENE-6450.patch


 At the moment all spatial capabilities, including basic point based indexing 
 and querying, require the lucene-spatial module. The spatial module, designed 
 to handle all things geo, requires dependency overhead (Spatial4j, JTS) to provide 
 spatial rigor for even the most simplistic spatial search use-cases (e.g., 
 lat/lon bounding box, point in poly, distance search). This feature trims the 
 overhead by adding a new GeoPointField type to core along with 
 GeoBoundingBoxQuery and GeoPolygonQuery classes to the .search package. This 
 field is intended as a straightforward lightweight type for the most basic 
 geo point use-cases without the overhead. 
 The field uses simple bit twiddling operations (currently morton hashing) to 
 encode lat/lon into a single long term.  The queries use simple 
 multi-phase filtering that starts with NumericRangeQuery to reduce the 
 candidate terms, deferring the more expensive math to the smaller 
 candidate sets.






[jira] [Updated] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms

2015-05-13 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-6481:
---
Attachment: LUCENE-6481.patch

New patch, starting from [~nknize]'s and then folding in the evilish random 
test I added for LUCENE-6477 ... maybe this can help debug why there are false 
negatives?

E.g. with this patch when I run:

{noformat}
ant test -Dtestcase=TestGeoPointQuery -Dtestmethod=testRandomTiny 
-Dtests.seed=F1E43F53709BFF82 -Dtests.verbose=true
{noformat}

It fails with this:
{noformat}
   [junit4]   2 NOTE: reproduce with: ant test  -Dtestcase=TestGeoPointQuery 
-Dtests.method=testRandomTiny -Dtests.seed=F1E43F53709BFF82 
-Dtests.locale=en_US -Dtests.timezone=Africa/Lome -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] FAILURE 2.91s | TestGeoPointQuery.testRandomTiny 
   [junit4] Throwable #1: java.lang.AssertionError: id=0 docID=0 lat=-27.18027939545 lon=-167.14191331870592 expected true but got: false deleted?=false
   [junit4]    at __randomizedtesting.SeedInfo.seed([F1E43F53709BFF82:B8A3E1152EBAC72E]:0)
   [junit4]    at org.apache.lucene.search.TestGeoPointQuery.verify(TestGeoPointQuery.java:301)
   [junit4]    at org.apache.lucene.search.TestGeoPointQuery.doTestRandom(TestGeoPointQuery.java:203)
   [junit4]    at org.apache.lucene.search.TestGeoPointQuery.testRandomTiny(TestGeoPointQuery.java:125)
   [junit4]    at java.lang.Thread.run(Thread.java:745)
{noformat}

The test case should be easy-ish to debug: it only indexes at most a few 10s of 
points...


 Improve GeoPointField type to only visit high precision boundary terms 
 ---

 Key: LUCENE-6481
 URL: https://issues.apache.org/jira/browse/LUCENE-6481
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Nicholas Knize
 Attachments: LUCENE-6481.patch, LUCENE-6481_WIP.patch


 Current GeoPointField [LUCENE-6450 | 
 https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges 
 along the space-filling curve that represent a provided bounding box.  This 
 determines which terms to visit in the terms dictionary and which to skip. 
 This is suboptimal for large bounding boxes, as we may end up visiting all 
 terms (and the term set could be quite large). 
 This incremental improvement is to improve GeoPointField to only visit high 
 precision terms in boundary ranges and use the postings list for ranges that 
 are completely within the target bounding box.
 A separate improvement is to switch over to auto-prefix and build an 
 Automaton representing the bounding box.  That can be tracked in a separate 
 issue.  
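The full-range vs. boundary-range distinction can be illustrated with a toy quadtree-style decomposition over a small integer grid: cells completely inside the query box are consumed as whole ranges (their postings can be read straight through), while partially covered cells at the recursion floor are the boundary cells whose individual high-precision terms still need a per-term check. This is only an illustrative sketch of the idea under invented names, not the actual Lucene implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitSketch {
    // Recursively relate square cells of a 2^k x 2^k grid to a query box.
    // "full" cells lie entirely inside the box; "boundary" cells (at the
    // minimum cell size) partially overlap it and need per-term filtering.
    static void relate(int cx, int cy, int size, int minSize,
                       int minX, int maxX, int minY, int maxY,
                       List<int[]> full, List<int[]> boundary) {
        int hx = cx + size - 1, hy = cy + size - 1;
        if (hx < minX || cx > maxX || hy < minY || cy > maxY) {
            return; // disjoint: skip this whole range
        }
        if (cx >= minX && hx <= maxX && cy >= minY && hy <= maxY) {
            full.add(new int[] { cx, cy, size }); // covered: take the whole range
            return;
        }
        if (size <= minSize) {
            boundary.add(new int[] { cx, cy, size }); // boundary: per-term check
            return;
        }
        int h = size / 2; // otherwise split into four child cells
        relate(cx, cy, h, minSize, minX, maxX, minY, maxY, full, boundary);
        relate(cx + h, cy, h, minSize, minX, maxX, minY, maxY, full, boundary);
        relate(cx, cy + h, h, minSize, minX, maxX, minY, maxY, full, boundary);
        relate(cx + h, cy + h, h, minSize, minX, maxX, minY, maxY, full, boundary);
    }

    public static void main(String[] args) {
        List<int[]> full = new ArrayList<>(), boundary = new ArrayList<>();
        // 4x4 grid, query box covering x,y in [1,2], splitting down to 1x1 cells
        relate(0, 0, 4, 1, 1, 2, 1, 2, full, boundary);
        System.out.println("full=" + full.size() + " boundary=" + boundary.size()); // full=4 boundary=0
    }
}
```

Raising the recursion floor (minSize) trades fewer visited ranges for more boundary cells that must be checked term by term, which is exactly the tension this issue targets.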


