[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582022#comment-13582022
 ] 

Markus Jelsma commented on SOLR-1632:
-

No, not yet. Please let me do some real tests first; there must be issues - the 
patch is over a year old! :)

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Fix For: 5.0

 Attachments: 3x_SOLR-1632_doesntwork.patch, distrib-2.patch, 
 distrib.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] [Updated] (SOLR-4474) The collection status API allows getting comprehensive status information for one collection.

2013-02-20 Thread milesli (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

milesli updated SOLR-4474:
--

Attachment: (was: CollectionsHandler.patch)

 The collection status API allows getting comprehensive status information for 
 one collection.
 -

 Key: SOLR-4474
 URL: https://issues.apache.org/jira/browse/SOLR-4474
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.1
Reporter: milesli
 Attachments: CollectionParams.patch, CollectionsHandler.patch


 api: 
 http://ip:port/solr/admin/collections?action=status&collection=collection1
 result: 
 {"responseHeader":{"status":0,"QTime":3812},
  "collection":{"collection1":{
    "index":{"leadSizeInBytes":65,"leadSize":"0.0634765625kb"},
    "docs":{"numDocs":0,"maxDoc":0},
    "shards":{
      "shard1":["collection1",{
        "routing":{
          "shard":"shard1",
          "roles":null,
          "state":"active",
          "core":"collection1",
          "collection":"collection1",
          "node_name":"10.224.202.81:8080_solr",
          "base_url":"http://10.224.202.81:8080/solr",
          "leader":"true"},
        "index":{
          "numDocs":0,
          "maxDoc":0,
          "version":1,
          "segmentCount":0,
          "current":true,
          "hasDeletions":false,
          "directory":"org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(org.apache.lucene.store.SimpleFSDirectory@E:\\workspace\\ws_red5\\csp\\example\\solr\\collection1\\data\\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@14cdfcf; maxCacheMB=48.0 maxMergeSizeMB=4.0)",
          "userData":{},
          "sizeInBytes":1271,
          "size":"1.24 KB"}
      }]
    }
  }}}
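 For reference, a minimal SolrJ sketch of how a client might call the proposed 
 status action (the "status" action itself comes from the attached patches; 
 everything else is stock SolrJ 4.x):

{code}
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("action", "status");          // action added by the CollectionsHandler patch
params.set("collection", "collection1");
QueryRequest req = new QueryRequest(params);
req.setPath("/admin/collections");       // route to the collections handler
HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr");
NamedList<Object> status = server.request(req);  // the JSON-ish structure shown above
System.out.println(status);
{code}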




[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-02-20 Thread Raintung Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582029#comment-13582029
 ] 

Raintung Li commented on SOLR-4449:
---

Hi Philip, one suggestion: don't open a new thread in 
BackupRequestLBHttpSolrServer, since that roughly doubles the number of 
threads, especially in the search path.
The thread that receives the search request already waits for responses from 
multiple shards; that same thread can submit the second request to whichever 
shards have timed out.
For example, for one search request across 3 shards:
the first request needs 1 + 3*2 = 7 threads to handle;
the second request, in the bad case (all 3 shards need to be resent), needs 10 
threads.

With the change:
the first request needs 1 + 3 = 4 threads to handle;
the second request needs 7 threads in the bad case.

Note that Solr currently uses simple random selection for load balancing. 
https://blog.heroku.com/archives/2013/2/16/routing_performance_update/ 
This blog argues that the random strategy performs terribly.
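To sketch the idea in code (ShardClient is a made-up stand-in, not a Solr 
class; this is an illustration of the threading pattern, not the patch itself):

{code}
import java.util.concurrent.*;

interface ShardClient { String query(String q) throws Exception; }

class BackupRequestSketch {
  // The coordinating thread waits up to waitMs itself; only if the primary
  // is slow does it submit one backup request, then it takes whichever of
  // the two responses arrives first. No dedicated watcher thread is needed.
  static String queryWithBackup(ExecutorService pool, final ShardClient primary,
      final ShardClient backup, final String q, long waitMs) throws Exception {
    CompletionService<String> cs = new ExecutorCompletionService<String>(pool);
    cs.submit(new Callable<String>() {
      public String call() throws Exception { return primary.query(q); }
    });
    Future<String> first = cs.poll(waitMs, TimeUnit.MILLISECONDS);
    if (first != null) {
      return first.get();  // primary answered within the deadline
    }
    cs.submit(new Callable<String>() {
      public String call() throws Exception { return backup.query(q); }
    });
    return cs.take().get();  // first of primary/backup to complete wins
  }
}
{code}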

 

 Enable backup requests for the internal solr load balancer
 --

 Key: SOLR-4449
 URL: https://issues.apache.org/jira/browse/SOLR-4449
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: philip hoy
Priority: Minor
 Attachments: SOLR-4449.patch


 Add the ability to configure the built-in solr load balancer such that it 
 submits a backup request to the next server in the list if the initial 
 request takes too long. Employing such an algorithm could improve the latency 
 of the 9xth percentile albeit at the expense of increasing overall load due 
 to additional requests. 




[jira] [Updated] (SOLR-4474) The collection status API allows getting comprehensive status information for one collection.

2013-02-20 Thread milesli (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

milesli updated SOLR-4474:
--

Attachment: CollectionsHandler.patch

 The collection status API allows getting comprehensive status information for 
 one collection.
 -

 Key: SOLR-4474
 URL: https://issues.apache.org/jira/browse/SOLR-4474
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.1
Reporter: milesli
 Attachments: CollectionParams.patch, CollectionsHandler.patch


 api: 
 http://ip:port/solr/admin/collections?action=status&collection=collection1
 result: 
 {"responseHeader":{"status":0,"QTime":3812},
  "collection":{"collection1":{
    "index":{"leadSizeInBytes":65,"leadSize":"0.0634765625kb"},
    "docs":{"numDocs":0,"maxDoc":0},
    "shards":{
      "shard1":["collection1",{
        "routing":{
          "shard":"shard1",
          "roles":null,
          "state":"active",
          "core":"collection1",
          "collection":"collection1",
          "node_name":"10.224.202.81:8080_solr",
          "base_url":"http://10.224.202.81:8080/solr",
          "leader":"true"},
        "index":{
          "numDocs":0,
          "maxDoc":0,
          "version":1,
          "segmentCount":0,
          "current":true,
          "hasDeletions":false,
          "directory":"org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(org.apache.lucene.store.SimpleFSDirectory@E:\\workspace\\ws_red5\\csp\\example\\solr\\collection1\\data\\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@14cdfcf; maxCacheMB=48.0 maxMergeSizeMB=4.0)",
          "userData":{},
          "sizeInBytes":1271,
          "size":"1.24 KB"}
      }]
    }
  }}}




[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-02-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582030#comment-13582030
 ] 

Markus Jelsma commented on SOLR-4260:
-

Nothing peculiar in the logs WARN logs. We don't log INFO usually unless 
something is really broken, that's too much data.


 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer, we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents: the leader and replica deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention: there were small IDF differences for exactly the same record, 
 causing it to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs.
 We're running a 10-node test cluster with 10 shards and a replication factor 
 of two, and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.




[jira] [Updated] (SOLR-4474) The collection status API allows getting comprehensive status information for one collection.

2013-02-20 Thread milesli (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

milesli updated SOLR-4474:
--

Description: 
api: http://ip:port/solr/admin/collections?action=status&collection=collection1

result: 

{"responseHeader":{"status":0,"QTime":3812},
 "collection":{"collection1":{
   "index":{"leadSizeInBytes":1271,"leadSize":"1.24 KB"},
   "docs":{"numDocs":0,"maxDoc":0},
   "shards":{
     "shard1":["collection1",{
       "routing":{
         "shard":"shard1",
         "roles":null,
         "state":"active",
         "core":"collection1",
         "collection":"collection1",
         "node_name":"10.224.202.81:8080_solr",
         "base_url":"http://10.224.202.81:8080/solr",
         "leader":"true"},
       "index":{
         "numDocs":0,
         "maxDoc":0,
         "version":1,
         "segmentCount":0,
         "current":true,
         "hasDeletions":false,
         "directory":"org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(org.apache.lucene.store.SimpleFSDirectory@E:\\workspace\\ws_red5\\csp\\example\\solr\\collection1\\data\\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@14cdfcf; maxCacheMB=48.0 maxMergeSizeMB=4.0)",
         "userData":{},
         "sizeInBytes":1271,
         "size":"1.24 KB"}
     }]
   }
 }}}

  was:
api: http://ip:port/solr/admin/collections?action=status&collection=collection1

result: 

{"responseHeader":{"status":0,"QTime":3812},
 "collection":{"collection1":{
   "index":{"leadSizeInBytes":65,"leadSize":"0.0634765625kb"},
   "docs":{"numDocs":0,"maxDoc":0},
   "shards":{
     "shard1":["collection1",{
       "routing":{
         "shard":"shard1",
         "roles":null,
         "state":"active",
         "core":"collection1",
         "collection":"collection1",
         "node_name":"10.224.202.81:8080_solr",
         "base_url":"http://10.224.202.81:8080/solr",
         "leader":"true"},
       "index":{
         "numDocs":0,
         "maxDoc":0,
         "version":1,
         "segmentCount":0,
         "current":true,
         "hasDeletions":false,
         "directory":"org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(org.apache.lucene.store.SimpleFSDirectory@E:\\workspace\\ws_red5\\csp\\example\\solr\\collection1\\data\\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@14cdfcf; maxCacheMB=48.0 maxMergeSizeMB=4.0)",
         "userData":{},
         "sizeInBytes":1271,
         "size":"1.24 KB"}
     }]
   }
 }}}


 The collection status API allows getting comprehensive status information for 
 one collection.
 -

 Key: SOLR-4474
 URL: https://issues.apache.org/jira/browse/SOLR-4474
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.1
Reporter: milesli
 Attachments: CollectionParams.patch, CollectionsHandler.patch


 api: 
 http://ip:port/solr/admin/collections?action=status&collection=collection1
 result: 
 {"responseHeader":{"status":0,"QTime":3812},
  "collection":{"collection1":{
    "index":{"leadSizeInBytes":1271,"leadSize":"1.24 KB"},
    "docs":{"numDocs":0,"maxDoc":0},
    "shards":{
      "shard1":["collection1",{
        "routing":{
          "shard":"shard1",
          "roles":null,
          "state":"active",
          "core":"collection1",
          "collection":"collection1",
          "node_name":"10.224.202.81:8080_solr",
          "base_url":"http://10.224.202.81:8080/solr",
          "leader":"true"},
        "index":{
          "numDocs":0,
          "maxDoc":0,
          "version":1,
          "segmentCount":0,
          "current":true,
          "hasDeletions":false,
          "directory":"org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(org.apache.lucene.store.SimpleFSDirectory@E:\\workspace\\ws_red5\\csp\\example\\solr\\collection1\\data\\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@14cdfcf; maxCacheMB=48.0 maxMergeSizeMB=4.0)",
          "userData":{},
          "sizeInBytes":1271,
          "size":"1.24 KB"}
      }]
    }
  }}}


[jira] [Comment Edited] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-02-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582030#comment-13582030
 ] 

Markus Jelsma edited comment on SOLR-4260 at 2/20/13 8:38 AM:
--

Nothing peculiar in the WARN logs. We don't log INFO usually unless something 
is really broken, that's too much data.


  was (Author: markus17):
Nothing peculiar in the logs WARN logs. We don't log INFO usually unless 
something is really broken, that's too much data.

  
 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer, we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents: the leader and replica deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention: there were small IDF differences for exactly the same record, 
 causing it to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs.
 We're running a 10-node test cluster with 10 shards and a replication factor 
 of two, and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.




[jira] [Resolved] (LUCENE-4765) Multi-valued docvalues field

2013-02-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4765.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.2

 Multi-valued docvalues field
 

 Key: LUCENE-4765
 URL: https://issues.apache.org/jira/browse/LUCENE-4765
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4765.patch, LUCENE-4765.patch


 The general idea is basically the docvalues parallel to 
 FieldCache.getDocTermOrds/UninvertedField.
 Currently this stuff is used in e.g. grouping and join for multivalued 
 fields, and in Solr for faceting.
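 For illustration, a quick sketch of what this enables at indexing time 
 (SortedSetDocValuesField is the new field type added here; the field name 
 and the writer are illustrative):

{code}
Document doc = new Document();
doc.add(new SortedSetDocValuesField("tags", new BytesRef("lucene")));
doc.add(new SortedSetDocValuesField("tags", new BytesRef("search")));
writer.addDocument(doc);  // both values end up as ordinals in one docvalues field
{code}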




[jira] [Updated] (LUCENE-4781) Backport classification module to branch_4x

2013-02-20 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-4781:


Attachment: LUCENE-4781.patch

Attaching a first patch with all the related trunk commits merged via svn merge.

 Backport classification module to branch_4x
 ---

 Key: LUCENE-4781
 URL: https://issues.apache.org/jira/browse/LUCENE-4781
 Project: Lucene - Core
  Issue Type: Task
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 4.2

 Attachments: LUCENE-4781.patch


 Backport lucene/classification from trunk to branch_4x.




[jira] [Resolved] (LUCENE-4781) Backport classification module to branch_4x

2013-02-20 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved LUCENE-4781.
-

Resolution: Fixed

 Backport classification module to branch_4x
 ---

 Key: LUCENE-4781
 URL: https://issues.apache.org/jira/browse/LUCENE-4781
 Project: Lucene - Core
  Issue Type: Task
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 4.2

 Attachments: LUCENE-4781.patch


 Backport lucene/classification from trunk to branch_4x.




[jira] [Updated] (LUCENE-3918) Port index sorter to trunk APIs

2013-02-20 Thread Anat (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anat updated LUCENE-3918:
-

Attachment: LUCENE-3918.patch

Added an updated version of the patch containing changes relating to the 
comments.

The major changes are:
Tests now extend LuceneTestCase.
List<Integer> was replaced with a growable int array.
Instead of using a Java Map object, we now encode the positions 
information into a DataOutput stream.
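A rough sketch of the encoding idea (names are illustrative; RAMOutputStream 
is one concrete DataOutput):

{code}
RAMOutputStream out = new RAMOutputStream();
out.writeVInt(docId);              // which document the positions belong to
out.writeVInt(positions.length);   // how many positions follow
for (int position : positions) {
  out.writeVInt(position);         // the positions themselves, as vints
}
{code}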

 Port index sorter to trunk APIs
 ---

 Key: LUCENE-3918
 URL: https://issues.apache.org/jira/browse/LUCENE-3918
 Project: Lucene - Core
  Issue Type: Task
  Components: modules/other
Affects Versions: 4.0-ALPHA
Reporter: Robert Muir
 Fix For: 4.2, 5.0

 Attachments: LUCENE-3918.patch, LUCENE-3918.patch


 LUCENE-2482 added an IndexSorter to 3.x, but we need to port this
 functionality to 4.0 apis.




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582142#comment-13582142
 ] 

Markus Jelsma commented on SOLR-1632:
-

It doesn't really seem to work; we're seeing lots of NPEs, and when a response 
does come through, IDF is not consistent for all terms. Most requests return 
one of the NPEs below. Sometimes it works, and then the second request just fails.

{code}
java.lang.NullPointerException
at 
org.apache.solr.search.stats.ExactStatsCache.sendGlobalStats(LRUStatsCache.java:202)
at 
org.apache.solr.handler.component.QueryComponent.createMainQuery(QueryComponent.java:783)
at 
org.apache.solr.handler.component.QueryComponent.regularDistributedProcess(QueryComponent.java:618)
at...
{code}

{code}
java.lang.NullPointerException
at 
org.apache.solr.search.stats.LRUStatsCache.sendGlobalStats(LRUStatsCache.java:228)
at 
org.apache.solr.handler.component.QueryComponent.createMainQuery(QueryComponent.java:783)
at 
org.apache.solr.handler.component.QueryComponent.regularDistributedProcess(QueryComponent.java:618)
at...
{code}

We also see this one from time to time; it looks like it is thrown if there 
are `no servers hosting shard`:
{code}
java.lang.NullPointerException
at 
org.apache.solr.search.stats.LRUStatsCache.mergeToGlobalStats(LRUStatsCache.java:112)
at 
org.apache.solr.handler.component.QueryComponent.updateStats(QueryComponent.java:743)
at 
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:659)
at 
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:634)
at ..
{code}

It also imposes a huge performance penalty with both LRUStatsCache and 
ExactStatsCache: if you're used to 40ms response times you'll see the average 
jump to 2 seconds, with very frequent 5-second spikes. Performance stays poor 
even if logging is disabled.

The logs are also swamped with lines like:
{code}
2013-02-20 11:54:48,091 WARN [search.stats.LRUStatsCache] - [http-8080-exec-5] 
- : ## Missing global colStats info: FIELD, using local
2013-02-20 11:54:48,091 WARN [search.stats.LRUStatsCache] - [http-8080-exec-5] 
- : ## Missing global termStats info: FIELD:TERM, using local
{code}

Both StatsCache implementations behave like this, and each query logs lines 
like the above. Maybe performance is poor because it tries to look up terms 
every time, but I'm not sure yet.


Finally, something crazy I'd like to share :)
{code}
-Infinity = (MATCH) sum of:
  -Infinity = (MATCH) max plus 0.35 times others of:
-Infinity = (MATCH) weight(content_nl:amsterdam^1.6 in 449) [], result of:
  -Infinity = score(doc=449,freq=1.0 = termFreq=1.0
), product of:
1.6 = boost
-Infinity = idf(docFreq=29800090, docCount=-1)
1.0 = tfNorm, computed from:
  1.0 = termFreq=1.0
  1.2 = parameter k1
  0.0 = parameter b (norms omitted for field)
{code}
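For what it's worth, the -Infinity is exactly what you'd get if that 
docCount=-1 is fed into BM25's idf (assuming the usual 
log(1 + (docCount - docFreq + 0.5)/(docFreq + 0.5)) form):

{code}
double docFreq = 29800090, docCount = -1;
double idf = Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
// (docCount - docFreq + 0.5) == -(docFreq + 0.5), so the ratio is exactly -1,
// the argument of log() is 0, and idf == Double.NEGATIVE_INFINITY
System.out.println(idf);  // -Infinity
{code}

So the broken part is the docCount=-1 ending up in the stats, not the scoring 
formula itself.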

If someone happens to recognize the issues above, I'm all ears :)

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Fix For: 5.0

 Attachments: 3x_SOLR-1632_doesntwork.patch, distrib-2.patch, 
 distrib.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-02-20 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582144#comment-13582144
 ] 

Per Steffensen commented on SOLR-4470:
--

I think you should be able to specify credentials both at SolrServer-level (all 
requests made through it will have the same credentials added) and at 
SolrRequest-level (so that you can use the same SolrServer for sending requests 
with different credentials). I added a credentials-field to SolrRequest and it 
is all fine if you create the SolrRequest object yourself, but unfortunately 
there is a set of helper-methods on SolrServer that basically create the 
SolrRequest object for you without giving you a chance to modify it afterwards. 
How do we prefer to hand over credentials for those SolrRequests? Ideas off 
the top of my head:
* 1) Add a credentials-param to all the helper-methods (maybe make two 
versions of each method - one that does and one that does not take a credentials 
object)
* 2) Change the SolrRequest constructor so that it supports reading credentials 
from a threadlocal that you set before calling one of the 
helper-methods (instead of providing it as a parameter to the helper-method)

I wouldn't want to do 1) before it is agreed by the community, and 2) is kinda 
hacky (even though I like to use threadlocals a lot more than the average 
developer seems to). It seems like 1) was used back when commitWithinMs was 
added, but maybe it is not the way to continue - we will end up with a huge set 
of similar (except for parameter differences) helper-methods. Actually I 
would have preferred that commitWithinMs was never made this way - maybe one 
should have foreseen that this is not the last parameter you want to be able to 
give to the helper-methods in general.
3) Maybe back then a callback-thingy should have been introduced instead of the 
commitWithinMs-parameter, one that could be used for modifying the SolrRequest 
object after it had been set up by the helper-method.
Of course 3) would also be an option now, but then we should really get rid of 
commitWithinMs and that will break API backwards compatibility.
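To make 2) concrete, a rough sketch (all names here are hypothetical; nothing 
like this exists in SolrJ today):

{code}
// Hypothetical holder; NOT an existing SolrJ API.
public final class CredentialsHolder {
  private static final ThreadLocal<String[]> CREDS = new ThreadLocal<String[]>();
  public static void set(String user, String password) {
    CREDS.set(new String[] { user, password });
  }
  public static String[] getAndClear() {
    String[] c = CREDS.get();
    CREDS.remove();  // avoid leaking credentials into later work on this thread
    return c;
  }
}

// The SolrRequest constructor would then pick the credentials up:
//   this.credentials = CredentialsHolder.getAndClear();

// Caller side, before a helper-method that builds the request internally:
CredentialsHolder.set("admin", "secret");
server.deleteByQuery("*:*");  // the internally created request inherits the credentials
{code}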

 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: Bug
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
  Labels: authentication, solrclient, solrcloud
 Fix For: 4.2


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
 also make internal requests to other Solr-nodes, and for it to work 
 credentials need to be provided there also.
 Ideally we would like to forward credentials from a particular request to 
 all the internal sub-requests it triggers, e.g. for search and update 
 requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered by outside requests (e.g. 
 shard creation/deletion/etc. based on calls to the Collection API)
 * that do not in any way relate to an outside super-request (e.g. 
 replica syncing stuff)
 We would like to aim at a solution where the original credentials are 
 forwarded when a request directly/synchronously triggers a sub-request, with 
 a fallback to configured internal credentials for the 
 asynchronous/non-rooted requests.
 In our solution we would aim at only supporting basic http auth, but we would 
 like to build a framework around it, so that not too much refactoring is 
 needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution but created this JIRA issue early in order to get 
 input/comments from the community as early as possible.




[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_38) - Build # 2559 - Failure!

2013-02-20 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/2559/
Java: 32bit/jdk1.6.0_38 -client -XX:+UseParallelGC

2 tests failed.
FAILED:  
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testNGramUsage

Error Message:
expected:<[74 65 63 68 6e 6f 6c 6f 67 79]> but was:<[]>

Stack Trace:
java.lang.AssertionError: expected:<[74 65 63 68 6e 6f 6c 6f 67 79]> but 
was:<[]>
at 
__randomizedtesting.SeedInfo.seed([94CC85F3F99FB68A:32F21DF7DB203F05]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:147)
at 
org.apache.lucene.classification.ClassificationTestBase.checkCorrectClassification(ClassificationTestBase.java:68)
at 
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testNGramUsage(SimpleNaiveBayesClassifierTest.java:38)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:662)


FAILED:  
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage

Error Message:
expected:<[74 65 63 68 6e 6f 6c 6f 67 79]> but was:<[]>

Stack Trace:

[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_38) - Build # 4373 - Failure!

2013-02-20 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/4373/
Java: 32bit/jdk1.6.0_38 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 20516 lines...]
-check-forbidden-test-apis:
[forbidden-apis] Reading API signatures: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/forbiddenApis/tests.txt
[forbidden-apis] Loading classes to check...
[forbidden-apis] Scanning for API signatures and dependencies...
[forbidden-apis] Forbidden method invocation: java.util.Random#<init>()
[forbidden-apis]   in org.apache.lucene.classification.utils.DataSplitterTest 
(DataSplitterTest.java:65)
[forbidden-apis] Forbidden method invocation: java.util.Random#<init>()
[forbidden-apis]   in org.apache.lucene.classification.utils.DataSplitterTest 
(DataSplitterTest.java:70)
[forbidden-apis] Forbidden method invocation: java.util.Random#<init>()
[forbidden-apis]   in org.apache.lucene.classification.utils.DataSplitterTest 
(DataSplitterTest.java:71)
[forbidden-apis] Forbidden method invocation: java.util.Random#<init>()
[forbidden-apis]   in org.apache.lucene.classification.utils.DataSplitterTest 
(DataSplitterTest.java:71)
[forbidden-apis] Forbidden method invocation: java.util.Random#<init>()
[forbidden-apis]   in org.apache.lucene.classification.utils.DataSplitterTest 
(DataSplitterTest.java:71)
[forbidden-apis] Forbidden method invocation: java.util.Random#<init>()
[forbidden-apis]   in org.apache.lucene.classification.utils.DataSplitterTest 
(DataSplitterTest.java:73)
[forbidden-apis] Forbidden method invocation: java.util.Random#<init>()
[forbidden-apis]   in org.apache.lucene.classification.utils.DataSplitterTest 
(DataSplitterTest.java:111)
[forbidden-apis] Scanned 2281 (and 1631 related) class file(s) for forbidden 
API invocations (in 1.32s), 7 error(s).

BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:381: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:67: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:186: Check 
for forbidden API calls failed, see log.

Total time: 47 minutes 44 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_38 -client -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure
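For context, all seven errors flag the same pattern; the usual fix in Lucene 
tests (a sketch, valid inside a LuceneTestCase subclass) is:

  Random forbidden = new Random();  // flagged: not reproducible from the test seed
  Random allowed = random();        // the seeded Random provided by LuceneTestCase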




Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_38) - Build # 4349 - Failure!

2013-02-20 Thread Michael McCandless
Duh!  First off, I didn't intend to commit this ... it was my attempt
to reproduce LUCENE-4775.  That method (merge.totalBytesSize) should
have thrown an exception eventually ...

But second off, no wonder I couldn't reproduce it!!  (not passing IWC).

Thanks for fixing Rob.

Mike McCandless

http://blog.mikemccandless.com

On Sun, Feb 17, 2013 at 10:28 AM, Robert Muir rcm...@gmail.com wrote:
 This test is broken.

 IndexWriterConfig iwc = newIndexWriterConfig(TEST_VERSION_CURRENT,
 new MockAnalyzer(random()));
 iwc.setMaxBufferedDocs(5);
 iwc.setMergeScheduler(new TrackingCMS());
 RandomIndexWriter w = new RandomIndexWriter(random(), d); <-- NOT
 using the IWC!!!

 then goes to index 100,000 docs. in this case it got serial merge
 scheduler, and the fields in the docs got term vectors.

 even if this test were to use the iwc (so it uses its crazy
 TrackingCMS), then i still don't understand what its testing. all that
 TrackingCMS does is keep summing up bytes merged into an unused
 variable.

 On Sun, Feb 17, 2013 at 8:16 AM, Policeman Jenkins Server
 jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/4349/
 Java: 32bit/jdk1.6.0_38 -server -XX:+UseSerialGC

 3 tests failed.
 FAILED:  
 junit.framework.TestSuite.org.apache.lucene.index.TestConcurrentMergeScheduler

 Error Message:
 Suite timeout exceeded (>= 720 msec).

 Stack Trace:
 java.lang.Exception: Suite timeout exceeded (>= 720 msec).
 at __randomizedtesting.SeedInfo.seed([9794B128B2451E9F]:0)


 REGRESSION:  
 org.apache.lucene.index.TestConcurrentMergeScheduler.testTotalBytesSize

 Error Message:
 Test abandoned because suite timeout was reached.

 Stack Trace:
 java.lang.Exception: Test abandoned because suite timeout was reached.
 at __randomizedtesting.SeedInfo.seed([9794B128B2451E9F]:0)


 REGRESSION:  org.apache.lucene.util.TestMaxFailuresRule.testMaxFailures

 Error Message:
 expected:<500> but was:<0>

 Stack Trace:
 java.lang.AssertionError: expected:<500> but was:<0>
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at org.junit.Assert.assertEquals(Assert.java:456)
 at 
 org.apache.lucene.util.TestMaxFailuresRule.testMaxFailures(TestMaxFailuresRule.java:103)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
 at org.junit.rules.RunRules.evaluate(RunRules.java:18)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
 at 
 com.carrotsearch.ant.tasks.junit4.slave.SlaveMain.execute(SlaveMain.java:180)
 at 
 com.carrotsearch.ant.tasks.junit4.slave.SlaveMain.main(SlaveMain.java:275)
 at 
 com.carrotsearch.ant.tasks.junit4.slave.SlaveMainSafe.main(SlaveMainSafe.java:12)




 Build Log:
 [...truncated 1388 lines...]
 [junit4:junit4] Suite: org.apache.lucene.index.TestConcurrentMergeScheduler
 [junit4:junit4]   2 

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_38) - Build # 4349 - Failure!

2013-02-20 Thread Robert Muir
I'm not sure I really fixed it!

I fixed the IWC to use this merge scheduler and made the test not so slow, 
but I noticed the value it always got for totalBytesSize is 0...

I didn't have time to dig in, but this seems screwy...
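(For reference, the one-line fix was presumably to actually pass the config 
through, i.e. something like:

  RandomIndexWriter w = new RandomIndexWriter(random(), d, iwc);  // TrackingCMS now in play

so the test at least exercises its own merge scheduler.)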

On Wed, Feb 20, 2013 at 4:49 AM, Michael McCandless
luc...@mikemccandless.com wrote:
 Duh!  First off, I didn't intend to commit this ... it was my attempt
 to reproduce LUCENE-4775.  That method (merge.totalBytesSize) should
 have thrown an exception eventually ...

 But second off, no wonder I couldn't reproduce it!!  (not passing IWC).

 Thanks for fixing Rob.

 Mike McCandless

 http://blog.mikemccandless.com

 On Sun, Feb 17, 2013 at 10:28 AM, Robert Muir rcm...@gmail.com wrote:
 This test is broken.

 IndexWriterConfig iwc = newIndexWriterConfig(TEST_VERSION_CURRENT,
 new MockAnalyzer(random()));
 iwc.setMaxBufferedDocs(5);
 iwc.setMergeScheduler(new TrackingCMS());
 RandomIndexWriter w = new RandomIndexWriter(random(), d); <-- NOT
 using the IWC!!!

 then goes to index 100,000 docs. in this case it got serial merge
 scheduler, and the fields in the docs got term vectors.

 even if this test were to use the iwc (so it uses its crazy
 TrackingCMS), then i still don't understand what its testing. all that
 TrackingCMS does is keep summing up bytes merged into an unused
 variable.

 On Sun, Feb 17, 2013 at 8:16 AM, Policeman Jenkins Server
 jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/4349/
 Java: 32bit/jdk1.6.0_38 -server -XX:+UseSerialGC

 3 tests failed.
 FAILED:  
 junit.framework.TestSuite.org.apache.lucene.index.TestConcurrentMergeScheduler

 Error Message:
 Suite timeout exceeded (>= 720 msec).

 Stack Trace:
 java.lang.Exception: Suite timeout exceeded (>= 720 msec).
 at __randomizedtesting.SeedInfo.seed([9794B128B2451E9F]:0)


 REGRESSION:  
 org.apache.lucene.index.TestConcurrentMergeScheduler.testTotalBytesSize

 Error Message:
 Test abandoned because suite timeout was reached.

 Stack Trace:
 java.lang.Exception: Test abandoned because suite timeout was reached.
 at __randomizedtesting.SeedInfo.seed([9794B128B2451E9F]:0)


 REGRESSION:  org.apache.lucene.util.TestMaxFailuresRule.testMaxFailures

 Error Message:
 expected:<500> but was:<0>

 Stack Trace:
 java.lang.AssertionError: expected:<500> but was:<0>
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at org.junit.Assert.assertEquals(Assert.java:456)
 at 
 org.apache.lucene.util.TestMaxFailuresRule.testMaxFailures(TestMaxFailuresRule.java:103)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
 at org.junit.rules.RunRules.evaluate(RunRules.java:18)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
 at 
 com.carrotsearch.ant.tasks.junit4.slave.SlaveMain.execute(SlaveMain.java:180)
 at 

[jira] [Reopened] (LUCENE-4781) Backport classification module to branch_4x

2013-02-20 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili reopened LUCENE-4781:
-


reopening as:

ant test  -Dtestcase=SimpleNaiveBayesClassifierTest 
-Dtests.method=testBasicUsage -Dtests.seed=94CC85F3F99FB68A -Dtests.slow=true 
-Dtests.locale=ar_BH -Dtests.timezone=Australia/South 
-Dtests.file.encoding=UTF-8

makes the SimpleNaiveBayesClassifier tests fail.

 Backport classification module to branch_4x
 ---

 Key: LUCENE-4781
 URL: https://issues.apache.org/jira/browse/LUCENE-4781
 Project: Lucene - Core
  Issue Type: Task
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 4.2

 Attachments: LUCENE-4781.patch


 Backport lucene/classification from trunk to branch_4x.




[jira] [Comment Edited] (SOLR-4470) Support for basic http auth in internal solr requests

2013-02-20 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582144#comment-13582144
 ] 

Per Steffensen edited comment on SOLR-4470 at 2/20/13 1:29 PM:
---

I think you should be able to specify credentials both at SolrServer-level (all 
requests made through it will have the same credentials added) and at 
SolrRequest-level (so that you can use the same SolrServer for sending requests 
with different credentials). I added a credentials-field to SolrRequest and it 
is all fine if you create the SolrRequest object yourself, but unfortunately 
there is a set of helper-methods on SolrServer that basically create the 
SolrRequest object for you without giving you a chance to modify it afterwards. 
How do we prefer to hand over credentials for those SolrRequests? Ideas off 
the top of my head:
* 1) Add a credentials-param to all the helper-methods (maybe make two 
versions of each method - one that does and one that does not take a credentials 
object)
* 2) Change the SolrRequest constructor so that it supports reading credentials 
from a threadlocal that you set before calling one of the 
helper-methods (instead of providing it as a parameter to the helper-method)

I wouldn't want to do 1) before it is agreed by the community, and 2) is kinda 
hacky (even though I like to use threadlocals a lot more than the average 
developer seems to). It seems like 1) was used back when commitWithinMs was 
added, but maybe it is not the way to continue - we will end up with a huge set 
of similar (except for parameter differences) helper-methods. Actually I 
would have preferred that commitWithinMs was never made this way - maybe one 
should have foreseen that this is not the last parameter you want to be able to 
give to the helper-methods in general, so maybe back then a callback-thingy 
should have been introduced instead of the commitWithinMs-parameter - a 
callback-thingy that could be used for modifying the SolrRequest object after 
it had been set up by the helper-method.
* 3) Of course that is also an option now, but then we should really get rid of 
commitWithinMs and that will break API backwards compatibility.

  was (Author: steff1193):
I think you should be able to specify credentials both at SolrServer-level 
(all requests made through it will have the same credentials added) and at 
SolrRequest-level (so that you can use the same SolrServer for sending requests 
with different credentials). I added a credentials-field to SolrRequest and it 
is all fine if you create the SolrRequest object yourself, but unfortunately 
there is a set of helper-methods on SolrServer that basically create the 
SolrRequest object for you without giving you a chance to modify it afterwards. 
How do we prefer to hand over credentials for those SolrRequests? Ideas off 
the top of my head:
* 1) Add a credentials-param to all the helper-methods (maybe make two 
versions of each method - one that does and one that does not take a credentials 
object)
* 2) Change the SolrRequest constructor so that it supports reading credentials 
from a threadlocal that you set before calling one of the 
helper-methods (instead of providing it as a parameter to the helper-method)

I wouldn't want to do 1) before it is agreed by the community, and 2) is kinda 
hacky (even though I like to use threadlocals a lot more than the average 
developer seems to). It seems like 1) was used back when commitWithinMs was 
added, but maybe it is not the way to continue - we will end up with a huge set 
of similar (except for parameter differences) helper-methods. Actually I 
would have preferred that commitWithinMs was never made this way - maybe one 
should have foreseen that this is not the last parameter you want to be able to 
give to the helper-methods in general.
3) Maybe back then a callback-thingy should have been introduced instead of the 
commitWithinMs-parameter, one that could be used for modifying the SolrRequest 
object after it had been set up by the helper-method.
Of course 3) would also be an option now, but then we should really get rid of 
commitWithinMs and that will break API backwards compatibility.
  
 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: Bug
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
  Labels: authentication, solrclient, solrcloud
 Fix For: 4.2


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as 

[jira] [Commented] (LUCENE-3918) Port index sorter to trunk APIs

2013-02-20 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582176#comment-13582176
 ] 

Adrien Grand commented on LUCENE-3918:
--

This patch seems to have been created against an old 4.x branch (4.0 or 4.1 
maybe?). We usually commit new features to trunk before backporting them to 
branch_4x; could you update the patch so that it applies on top of trunk? 
Some comments on this new patch:
 - All Sorter implementations open the provided Directory and close it before 
returning; shouldn't this interface directly take an IndexReader as an argument?
 - SorterUtil.sort uses the stored fields API to create a new sorted index; 
this won't work in a few cases, especially if fields are not stored. I think it 
should rather use {{IndexWriter.addIndexes(IndexReader...)}} (see the sketch 
after this list).
 - SortingIndexReader constructor expects a CompositeIndexReader and calls 
{{new SlowCompositeReaderWrapper()}} to have an atomic view of this reader. I 
think it should take any index reader and wrap it using 
{{SlowCompositeReaderWrapper.wrap}} (compared to {{new 
SlowCompositeReaderWrapper()}}, this optimizes the case where the composite 
reader only wraps a single atomic reader).
 - Why does SortingIndexReader.getLiveDocs always return null?
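For illustration, the addIndexes-based flow could look like this 
(SortingIndexReader and the sorter come from the patch, so the exact 
constructor may differ; the rest is stock 4.x API):

{code}
IndexReader unsorted = DirectoryReader.open(inputDir);
IndexWriter writer = new IndexWriter(outputDir,
    new IndexWriterConfig(Version.LUCENE_42, new WhitespaceAnalyzer(Version.LUCENE_42)));
// addIndexes copies postings, stored fields, vectors and docvalues through
// the codec, so nothing is lost even when fields are not stored:
writer.addIndexes(new SortingIndexReader(SlowCompositeReaderWrapper.wrap(unsorted), sorter));
writer.close();
unsorted.close();
{code}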

 Port index sorter to trunk APIs
 ---

 Key: LUCENE-3918
 URL: https://issues.apache.org/jira/browse/LUCENE-3918
 Project: Lucene - Core
  Issue Type: Task
  Components: modules/other
Affects Versions: 4.0-ALPHA
Reporter: Robert Muir
 Fix For: 4.2, 5.0

 Attachments: LUCENE-3918.patch, LUCENE-3918.patch


 LUCENE-2482 added an IndexSorter to 3.x, but we need to port this
 functionality to 4.0 apis.




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582178#comment-13582178
 ] 

Mark Miller commented on SOLR-1632:
---

Hmm, that makes it look like the current tests for this must be pretty weak 
then.

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Fix For: 5.0

 Attachments: 3x_SOLR-1632_doesntwork.patch, distrib-2.patch, 
 distrib.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] [Updated] (SOLR-4476) Bold on bold doesn't show highlighting on /browse

2013-02-20 Thread Upayavira (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Upayavira updated SOLR-4476:


Attachment: SOLR-4476.patch

Patch to switch highlighting from bold to italics in /browse

 Bold on bold doesn't show highlighting on /browse
 -

 Key: SOLR-4476
 URL: https://issues.apache.org/jira/browse/SOLR-4476
 Project: Solr
  Issue Type: Bug
Reporter: Upayavira
Priority: Trivial
 Attachments: SOLR-4476.patch


 Hit highlighting is enabled for /browse, but you can't see the results of it 
 because the field is already in bold. The attached (trivial) patch changes to 
 italics so you can actually see highlighting functioning.




[jira] [Created] (SOLR-4476) Bold on bold doesn't show highlighting on /browse

2013-02-20 Thread Upayavira (JIRA)
Upayavira created SOLR-4476:
---

 Summary: Bold on bold doesn't show highlighting on /browse
 Key: SOLR-4476
 URL: https://issues.apache.org/jira/browse/SOLR-4476
 Project: Solr
  Issue Type: Bug
Reporter: Upayavira
Priority: Trivial
 Attachments: SOLR-4476.patch

Hit highlighting is enabled for /browse, but you can't see the results of it 
because the field is already in bold. The attached (trivial) patch changes to 
italics so you can actually see highlighting functioning.




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582188#comment-13582188
 ] 

Markus Jelsma commented on SOLR-1632:
-

Things have changed a lot in the past 13 months and I haven't figured it all 
out yet. I'll try to make sense of it, but some expert opinion and a trial of 
the patch would be more than helpful. Is Andrzej not around?

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Fix For: 5.0

 Attachments: 3x_SOLR-1632_doesntwork.patch, distrib-2.patch, 
 distrib.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4782) Let the NaiveBayes classifier have a fallback docCount method if codec doesn't support Terms#docCount()

2013-02-20 Thread Tommaso Teofili (JIRA)
Tommaso Teofili created LUCENE-4782:
---

 Summary: Let the NaiveBayes classifier have a fallback docCount 
method if codec doesn't support Terms#docCount()
 Key: LUCENE-4782
 URL: https://issues.apache.org/jira/browse/LUCENE-4782
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/classification
Reporter: Tommaso Teofili
 Fix For: 4.2, 5.0


In _SimpleNaiveBayesClassifier_ the _docsWithClassSize_ variable is initialized to 
_MultiFields.getTerms(this.atomicReader, this.classFieldName).getDocCount()_, 
which may be -1 if the codec doesn't support doc counts; therefore there should 
be an alternative way to initialize such a variable with the document count.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4781) Backport classification module to branch_4x

2013-02-20 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved LUCENE-4781.
-

Resolution: Fixed

resolving as the latest failure is related to LUCENE-4782

 Backport classification module to branch_4x
 ---

 Key: LUCENE-4781
 URL: https://issues.apache.org/jira/browse/LUCENE-4781
 Project: Lucene - Core
  Issue Type: Task
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 4.2

 Attachments: LUCENE-4781.patch


 Backport lucene/classification from trunk to branch_4x.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2013-02-20 Thread Dmitry Kan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582207#comment-13582207
 ] 

Dmitry Kan commented on LUCENE-1486:


OK, after some study, here is what we did:

we treat the AND clauses as spanNearQuery objects. So, the

a AND b

becomes %a b%~slop, where the %...%~ operator is an unordered SpanNear query (a 
change to QueryParser.jj was required for this).

When there is a case of NOT clause with nested clauses:

NOT( (a AND b) OR (c AND d) ) = NOT ( %a b%~slop OR %c d%~slop ) ,

we need to handle SpanNearQueries in the addComplexPhraseClause method. In 
order to handle this, we just added to the if statement:

[code]
if (qc instanceof BooleanQuery) {
[/code]

the following else if statement:

[code]
else if (childQuery instanceof SpanNearQuery) {
ors.add((SpanQuery)childQuery);
}
[/code]
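
Put together, the modified branch inside addComplexPhraseClause then looks 
roughly like this (a sketch only - variable names are abbreviated from the 
actual ComplexPhraseQueryParser source):

[code]
Query childQuery = clause.getQuery();
if (childQuery instanceof BooleanQuery) {
  addComplexPhraseClause(ors, (BooleanQuery) childQuery); // recurse as before
} else if (childQuery instanceof SpanNearQuery) {
  ors.add((SpanQuery) childQuery); // new: pass the span-near built for AND through
}
[/code]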


 Wildcards, ORs etc inside Phrase queries
 

 Key: LUCENE-1486
 URL: https://issues.apache.org/jira/browse/LUCENE-1486
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser
Affects Versions: 2.4
Reporter: Mark Harwood
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: ComplexPhraseQueryParser.java, 
 junit_complex_phrase_qp_07_21_2009.patch, 
 junit_complex_phrase_qp_07_22_2009.patch, Lucene-1486 non default 
 field.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
 LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
 TestComplexPhraseQuery.java


 An extension to the default QueryParser that overrides the parsing of 
 PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
 The implementation feels a little hacky - this is arguably better handled in 
 QueryParser itself. This works as a proof of concept  for much of the query 
 parser syntax. Examples from the Junit test include:
   checkMatches("\"j*   smyth~\"", "1,2"); // wildcards and fuzzies are OK in phrases
   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works
   checkBadQuery("\"jo*  id:1 smith\""); // mixing fields in a phrase is bad
   checkBadQuery("\"jo* \"smith\" \""); // phrases inside phrases is bad
   checkBadQuery("\"jo* [sma TO smZ]\" \""); // range queries inside phrases not supported
 Code plus Junit test to follow...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4477) match-only query support (terms,wildcards,ranges) for docvalues fields.

2013-02-20 Thread Robert Muir (JIRA)
Robert Muir created SOLR-4477:
-

 Summary: match-only query support (terms,wildcards,ranges) for 
docvalues fields.
 Key: SOLR-4477
 URL: https://issues.apache.org/jira/browse/SOLR-4477
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2
Reporter: Robert Muir
 Attachments: SOLR-4477.patch

Historically, you had to invert fields (indexed=true) to do any queries against 
them.

But now it's possible to build a forward index for the field (docValues=true).
I think in many cases (e.g. a string field you only sort and match on), it's 
unnecessary and wasteful to force the user to also invert if they don't need 
scoring.

So I think Solr should support match-only semantics in this case for term, 
wildcard, range, etc.
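
For illustration, the field definition this would enable might look like the 
following (a hypothetical schema.xml snippet - the docValues attribute is real, 
the field itself is made up):

{code}
<!-- sortable and (with this issue) matchable, but never inverted -->
<field name="state" type="string" indexed="false" stored="false" docValues="true"/>
{code}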


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4477) match-only query support (terms,wildcards,ranges) for docvalues fields.

2013-02-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-4477:
--

Attachment: SOLR-4477.patch

initial patch

 match-only query support (terms,wildcards,ranges) for docvalues fields.
 ---

 Key: SOLR-4477
 URL: https://issues.apache.org/jira/browse/SOLR-4477
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2
Reporter: Robert Muir
 Attachments: SOLR-4477.patch


 Historically, you had to invert fields (indexed=true) to do any queries 
 against them.
 But now it's possible to build a forward index for the field (docValues=true).
 I think in many cases (e.g. a string field you only sort and match on), it's 
 unnecessary and wasteful to force the user to also invert if they don't need 
 scoring.
 So I think Solr should support match-only semantics in this case for term, 
 wildcard, range, etc.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4782) Let the NaiveBayes classifier have a fallback docCount method if codec doesn't support Terms#docCount()

2013-02-20 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved LUCENE-4782.
-

Resolution: Fixed
  Assignee: Tommaso Teofili

 Let the NaiveBayes classifier have a fallback docCount method if codec 
 doesn't support Terms#docCount()
 ---

 Key: LUCENE-4782
 URL: https://issues.apache.org/jira/browse/LUCENE-4782
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 4.2, 5.0


 In _SimpleNaiveBayesClassifier_ the _docsWithClassSize_ variable is initialized 
 to _MultiFields.getTerms(this.atomicReader, 
 this.classFieldName).getDocCount()_, which may be -1 if the codec doesn't 
 support doc counts; therefore there should be an alternative way to 
 initialize such a variable with the document count.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4782) Let the NaiveBayes classifier have a fallback docCount method if codec doesn't support Terms#docCount()

2013-02-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582237#comment-13582237
 ] 

Robert Muir commented on LUCENE-4782:
-

I'm not sure we have to realistically worry about this too much.

It only applies to 3.x indexes: in general all current codecs support this 
statistic.

So another option is to simply add a @SuppressCodecs("Lucene3x") annotation to the 
classification module and document that you should run IndexUpgrader on any old 
3.x segments you have lying around.
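
A sketch of what that would look like (using the test-framework annotation; the 
class picked here is just an example):

{code}
@SuppressCodecs("Lucene3x") // skip the legacy 3.x codec for these tests
public class SimpleNaiveBayesClassifierTest extends ClassificationTestBase {
  // test methods unchanged
}
{code}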


 Let the NaiveBayes classifier have a fallback docCount method if codec 
 doesn't support Terms#docCount()
 ---

 Key: LUCENE-4782
 URL: https://issues.apache.org/jira/browse/LUCENE-4782
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 4.2, 5.0


 In _SimpleNaiveBayesClassifier_ the _docsWithClassSize_ variable is initialized 
 to _MultiFields.getTerms(this.atomicReader, 
 this.classFieldName).getDocCount()_, which may be -1 if the codec doesn't 
 support doc counts; therefore there should be an alternative way to 
 initialize such a variable with the document count.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-02-20 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582274#comment-13582274
 ] 

Per Steffensen commented on SOLR-4470:
--

Usually you structure URLs in your web-app by increasing level of detail from 
left to right. This way you can configure the web-container to handle the most 
obvious security-constraints.

Unfortunately, IMHO, Solr URLs are not structured by increasing level of 
detail - e.g. /solr/collection1/update should have been 
/solr/update/collection1 (I consider collection/replica as a higher level 
of detail than update). Due to limitations on url-patterns in 
<security-constraint>/<web-resource-collection>s in web.xml (or webdefault.xml 
for jetty) you cannot write e.g. <url-pattern>/solr/*/update</url-pattern>, 
<url-pattern>*/update</url-pattern> or <url-pattern>*update</url-pattern> to 
protect updates - it is not allowed - you can only have * at the end or as part 
of an extension-pattern (*.ext - and the . needs to be there).

Therefore it is not possible (AFAIK) to configure the web-container to protect 
the update-operation or select-operation etc. You can configure protection on 
all operations for a specific collection (but not specific operations 
cross-collections), but it is much more unlikely that that is what you want to 
do. Or, by mentioning <url-pattern>/solr/<collection-name>/update</url-pattern> 
for every single collection in your setup, you can actually protect e.g. update, 
but that is not possible for those of us that have a dynamic/ever-changing set 
of collections.

Possible solutions from the top of my head:
* 1) Make the Solr URL structure right - e.g. /solr/update/collection1
* 2) Accept that obvious security constraints like protecting update or 
protecting search etc. cannot be done by web.xml configuration, and leave it up 
to programmatic protection
I like 1) best, but is that at all feasible, or will it just be way too much 
work?
Since Solr is usually not something you change yourself, but something you use 
out-of-the-box (potentially modifying deployment-descriptors (e.g. web.xml), 
config files etc.), 2) will really not help the normal Solr user, and it will 
also be a problem figuring out exactly where to place this programmatic 
protection-code, because even though most Solr-stuff is handled by 
SolrDispatchFilter, there are several resources that are not handled through it.
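
To make the limitation concrete: the closest you can get today is one constraint 
per collection, along the lines of the following web.xml fragment (standard 
Servlet syntax - the collection and role names are made up), which has to be 
repeated for every collection and is exactly what breaks down with a 
dynamic/ever-changing set of collections:

{code}
<security-constraint>
  <web-resource-collection>
    <web-resource-name>collection1 updates</web-resource-name>
    <url-pattern>/collection1/update</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>update-role</role-name>
  </auth-constraint>
</security-constraint>
{code}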


 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: Bug
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
  Labels: authentication, solrclient, solrcloud
 Fix For: 4.2


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
 also make internal requests to other Solr-nodes, and for it to work, 
 credentials need to be provided here also.
 Ideally we would like to forward credentials from a particular request to 
 all the internal sub-requests it triggers, e.g. for search and update 
 requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered by outside requests (e.g. 
 shard creation/deletion/etc based on calls to the Collection API)
 * that do not in any way have a relation to an outside super-request (e.g. 
 replica synching stuff)
 We would like to aim at a solution where original credentials are 
 forwarded when a request directly/synchronously triggers a subrequest, with a 
 fallback to configured internal credentials for the 
 asynchronous/non-rooted requests.
 In our solution we would aim at only supporting basic http auth, but we would 
 like to make a framework around it, so that not too much refactoring is 
 needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution but created this JIRA issue early in order to get 
 input/comments from the community as early as possible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4478) Allow cores to specify a named config set

2013-02-20 Thread Erick Erickson (JIRA)
Erick Erickson created SOLR-4478:


 Summary: Allow cores to specify a named config set
 Key: SOLR-4478
 URL: https://issues.apache.org/jira/browse/SOLR-4478
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson


Part of moving forward to the new way, after SOLR-4196 etc... I propose an 
additional parameter specified on the <core> node in solr.xml, or as a parameter 
in the discovery-mode core.properties file; call it "configSet", where the 
value provided is a path to a directory, either absolute or relative. Really, 
this is as though you copied the conf directory somewhere to be used by more 
than one core.

Straw-man: There will be a directory solr_home/configsets which will be the 
default. If the configSet parameter is, say, myconf, then I'd expect a 
directory named myconf to exist in solr_home/configsets, which would look 
something like
solr_home/configsets/myconf/schema.xml
  solrconfig.xml
  stopwords.txt
  velocity
  velocity/query.vm

etc.

If multiple cores used the same configSet, schema, solrconfig etc. would all be 
shared (i.e. shareSchema=true would be assumed). I don't see a good use-case 
for _not_ sharing schemas, so I don't propose to allow this to be turned off. 
Hmmm, what if shareSchema is explicitly set to false in the solr.xml or 
properties file? I'd guess it should be honored but maybe log a warning?

Mostly I'm putting this up for comments. I know that there are already thoughts 
about how this all should work floating around, so before I start any work on 
this I thought I'd at least get an idea of whether this is the way people are 
thinking about going.

Configset can be either a relative or absolute path; if relative, it's assumed 
to be relative to solr_home.

Thoughts?
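
For the sake of discussion, a discovery-mode core reusing a shared config set 
might then look like this (straw-man syntax only - configSet is the proposed 
parameter, everything else is illustrative):

{code}
# core.properties for a core picking up solr_home/configsets/myconf
name=core1
configSet=myconf
{code}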

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #252: POMs out of sync

2013-02-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/252/

No tests ran.

Build Log:
[...truncated 2394 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-4470) Support for basic http auth in internal solr requests

2013-02-20 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582274#comment-13582274
 ] 

Per Steffensen edited comment on SOLR-4470 at 2/20/13 4:13 PM:
---

Usually you structure URLs in your web-app by increasing level of detail from 
left to right. This way you can configure the web-container to handle the most 
obvious security-constraints.

Unfortunately, IMHO, Solr URLs are not structured by increasing level of 
detail - e.g. /solr/collection1/update should have been 
/solr/update/collection1 (I consider collection/replica as a higher level 
of detail than update). Due to limitations on url-patterns in 
<security-constraint>/<web-resource-collection>s in web.xml (or webdefault.xml 
for jetty) you cannot write e.g. <url-pattern>/solr/\*/update</url-pattern>, 
<url-pattern>\*/update</url-pattern> or <url-pattern>\*update</url-pattern> to 
protect updates - it is not allowed - you can only have * at the end or as part 
of an extension-pattern (*.ext - and the . needs to be there).

Therefore it is not possible (AFAIK) to configure the web-container to protect 
the update-operation or select-operation etc. You can configure protection on 
all operations for a specific collection (but not specific operations 
cross-collections), but it is much more unlikely that that is what you want to 
do. Or, by mentioning <url-pattern>/solr/<collection-name>/update</url-pattern> 
for every single collection in your setup, you can actually protect e.g. update, 
but that is not possible for those of us that have a dynamic/ever-changing set 
of collections.

Possible solutions from the top of my head:
* 1) Make the Solr URL structure right - e.g. /solr/update/collection1
* 2) Accept that obvious security constraints like protecting update or 
protecting search etc. cannot be done by web.xml configuration, and leave it up 
to programmatic protection
I like 1) best, but is that at all feasible, or will it just be way too much 
work?
Since Solr is usually not something you change yourself, but something you use 
out-of-the-box (potentially modifying deployment-descriptors (e.g. web.xml), 
config files etc.), 2) will really not help the normal Solr user, and it will 
also be a problem figuring out exactly where to place this programmatic 
protection-code, because even though most Solr-stuff is handled by 
SolrDispatchFilter, there are several resources that are not handled through it.


  was (Author: steff1193):
Usually you structure URLs in your web-app by increasing level of detail 
from left to right. This way you can configure the web-container to handle the 
most obvious security-constraints.

Unfortunately, IMHO, Solr URLs are not structured by increasing level of 
detail - e.g. /solr/collection1/update should have been 
/solr/update/collection1 (I consider collection/replica as a higher level 
of detail than update). Due to limitations on url-patterns in 
<security-constraint>/<web-resource-collection>s in web.xml (or webdefault.xml 
for jetty) you cannot write e.g. <url-pattern>/solr/*/update</url-pattern>, 
<url-pattern>*/update</url-pattern> or <url-pattern>*update</url-pattern> to 
protect updates - it is not allowed - you can only have * at the end or as part 
of an extension-pattern (*.ext - and the . needs to be there).

Therefore it is not possible (AFAIK) to configure the web-container to protect 
the update-operation or select-operation etc. You can configure protection on 
all operations for a specific collection (but not specific operations 
cross-collections), but it is much more unlikely that that is what you want to 
do. Or, by mentioning <url-pattern>/solr/<collection-name>/update</url-pattern> 
for every single collection in your setup, you can actually protect e.g. update, 
but that is not possible for those of us that have a dynamic/ever-changing set 
of collections.

Possible solutions from the top of my head:
* 1) Make the Solr URL structure right - e.g. /solr/update/collection1
* 2) Accept that obvious security constraints like protecting update or 
protecting search etc. cannot be done by web.xml configuration, and leave it up 
to programmatic protection
I like 1) best, but is that at all feasible, or will it just be way too much 
work?
Since Solr is usually not something you change yourself, but something you use 
out-of-the-box (potentially modifying deployment-descriptors (e.g. web.xml), 
config files etc.), 2) will really not help the normal Solr user, and it will 
also be a problem figuring out exactly where to place this programmatic 
protection-code, because even though most Solr-stuff is handled by 
SolrDispatchFilter, there are several resources that are not handled through it.

  
 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: 

[jira] [Comment Edited] (SOLR-4470) Support for basic http auth in internal solr requests

2013-02-20 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582274#comment-13582274
 ] 

Per Steffensen edited comment on SOLR-4470 at 2/20/13 4:16 PM:
---

Usually you structure URLs in your web-app by increasing level of detail from 
left to right. This way you can configure the web-container to handle the most 
obvious security-constraints.

Unfortunately, IMHO, Solr URLs are not structured by increasing level of 
detail - e.g. /solr/collection1/update should have been 
/solr/update/collection1 (I consider collection/replica as a higher level 
of detail than update). Due to limitations on url-patterns in 
<security-constraint>/<web-resource-collection>s in web.xml (or webdefault.xml 
for jetty) you cannot write e.g. <url-pattern>/solr/\*/update</url-pattern>, 
<url-pattern>\*/update</url-pattern> or <url-pattern>\*update</url-pattern> to 
protect updates - it is not allowed - you can only have * at the end or as part 
of an extension-pattern (*.ext - and the . needs to be there).

Therefore it is not possible (AFAIK) to configure the web-container to protect 
the update-operation or select-operation etc. You can configure protection on 
all operations for a specific collection (but not specific operations 
cross-collections), but it is much more unlikely that that is what you want to 
do. Or, by mentioning <url-pattern>/solr/<collection-name>/update</url-pattern> 
for every single collection in your setup, you can actually protect e.g. update, 
but that is not possible for those of us that have a dynamic/ever-changing set 
of collections.

Possible solutions from the top of my head:
* 1) Make the Solr URL structure right - e.g. /solr/update/collection1
* 2) Accept that obvious security constraints like protecting update or 
protecting search etc. cannot be done by web.xml configuration, and leave it up 
to programmatic protection

I like 1) best, but is that at all feasible, or will it just be way too much 
work?
Since Solr is usually not something you change yourself, but something you use 
out-of-the-box (potentially modifying deployment-descriptors (e.g. web.xml), 
config files etc.), 2) will really not help the normal Solr user, and it will 
also be a problem figuring out exactly where to place this programmatic 
protection-code, because even though most Solr-stuff is handled by 
SolrDispatchFilter, there are several resources that are not handled through it.


  was (Author: steff1193):
Usually you structure URLs in your web-app by increasing level of detail 
from left to right. This way you can configure the web-container to handle the 
most obvious security-constraints.

Unfortunately, IMHO, Solr URLs are not structured by increasing level of 
detail - e.g. /solr/collection1/update should have been 
/solr/update/collection1 (I consider collection/replica as a higher level 
of detail than update). Due to limitations on url-patterns in 
<security-constraint>/<web-resource-collection>s in web.xml (or webdefault.xml 
for jetty) you cannot write e.g. <url-pattern>/solr/\*/update</url-pattern>, 
<url-pattern>\*/update</url-pattern> or <url-pattern>\*update</url-pattern> to 
protect updates - it is not allowed - you can only have * at the end or as part 
of an extension-pattern (*.ext - and the . needs to be there).

Therefore it is not possible (AFAIK) to configure the web-container to protect 
the update-operation or select-operation etc. You can configure protection on 
all operations for a specific collection (but not specific operations 
cross-collections), but it is much more unlikely that that is what you want to 
do. Or, by mentioning <url-pattern>/solr/<collection-name>/update</url-pattern> 
for every single collection in your setup, you can actually protect e.g. update, 
but that is not possible for those of us that have a dynamic/ever-changing set 
of collections.

Possible solutions from the top of my head:
* 1) Make the Solr URL structure right - e.g. /solr/update/collection1
* 2) Accept that obvious security constraints like protecting update or 
protecting search etc. cannot be done by web.xml configuration, and leave it up 
to programmatic protection
I like 1) best, but is that at all feasible, or will it just be way too much 
work?
Since Solr is usually not something you change yourself, but something you use 
out-of-the-box (potentially modifying deployment-descriptors (e.g. web.xml), 
config files etc.), 2) will really not help the normal Solr user, and it will 
also be a problem figuring out exactly where to place this programmatic 
protection-code, because even though most Solr-stuff is handled by 
SolrDispatchFilter, there are several resources that are not handled through it.

  
 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: 

[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-02-20 Thread philip hoy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582304#comment-13582304
 ] 

philip hoy commented on SOLR-4449:
--

Hi Raintung, I don't think there will be that many threads created, although it 
depends somewhat on how the injected ThreadPoolExecutor and HttpClient are 
configured. 

For the good case, where the first shard responds within the delay period, 
there is, I think, one thread for the orchestration code and one to send and 
receive the request.

For the bad case, where there are three shards all responding slowly and where 
maximumConcurrentRequests is set to 3 or higher, it will again use one 
thread for the orchestration code, but this time there will be three threads 
used to send and receive responses for the three requests.
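
The orchestration itself is cheap. A minimal standalone sketch of the 
backup-request pattern (the general idea only, not the patch - names, delays and 
the CompletionService wiring are all illustrative):

{code}
import java.util.concurrent.*;

public class BackupRequestSketch {
  static final ExecutorService pool = Executors.newCachedThreadPool();
  static final CompletionService<String> done =
      new ExecutorCompletionService<String>(pool);

  public static void main(String[] args) throws Exception {
    submit("shard-primary", 250);                // a slow primary...
    Future<String> first = done.poll(100, TimeUnit.MILLISECONDS);
    if (first == null) {                         // ...no answer within the delay,
      submit("shard-backup", 20);                // so fire the backup request
      first = done.take();                       // and take whichever answers first
    }
    System.out.println(first.get());
    pool.shutdownNow();                          // abandon the straggler
  }

  static void submit(final String server, final long latencyMs) {
    done.submit(new Callable<String>() {
      public String call() throws Exception {
        Thread.sleep(latencyMs);                 // simulated request latency
        return "response from " + server;
      }
    });
  }
}
{code}

One thread runs the orchestration (here, main) and one thread per in-flight 
request, which matches the counts above.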



 Enable backup requests for the internal solr load balancer
 --

 Key: SOLR-4449
 URL: https://issues.apache.org/jira/browse/SOLR-4449
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: philip hoy
Priority: Minor
 Attachments: SOLR-4449.patch


 Add the ability to configure the built-in solr load balancer such that it 
 submits a backup request to the next server in the list if the initial 
 request takes too long. Employing such an algorithm could improve the latency 
 of the 9xth percentile albeit at the expense of increasing overall load due 
 to additional requests. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes

2013-02-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582305#comment-13582305
 ] 

Mark Miller commented on SOLR-4196:
---

bq. with the problem that seems to come up intermittently

Are you seeing this on your machine? If so, please file or add to a JIRA issue 
for that test with the details.

 Untangle XML-specific nature of Config and Container classes
 

 Key: SOLR-4196
 URL: https://issues.apache.org/jira/browse/SOLR-4196
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, StressTest.zip, StressTest.zip, StressTest.zip


 sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need 
 to pull all of the specific XML processing out of Config and Container. 
 Currently, we refer to xpaths all over the place. This JIRA is about 
 providing a thunking layer to isolate the XML-esque nature of solr.xml and 
 allow a simple properties file to be used instead which will lead, 
 eventually, to solr.xml going away.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes

2013-02-20 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582316#comment-13582316
 ] 

Erick Erickson commented on SOLR-4196:
--

bq: Are you seeing this on your machine? If so, please file or add to a JIRA 
issue for that test with the details.

Nope, just going from the emails to the dev list from Apache. Or maybe I'm 
remembering from some time ago.

 Untangle XML-specific nature of Config and Container classes
 

 Key: SOLR-4196
 URL: https://issues.apache.org/jira/browse/SOLR-4196
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, StressTest.zip, StressTest.zip, StressTest.zip


 sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need 
 to pull all of the specific XML processing out of Config and Container. 
 Currently, we refer to xpaths all over the place. This JIRA is about 
 providing a thunking layer to isolate the XML-esque nature of solr.xml and 
 allow a simple properties file to be used instead which will lead, 
 eventually, to solr.xml going away.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes

2013-02-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582324#comment-13582324
 ] 

Mark Miller commented on SOLR-4196:
---

Unless you see something fail locally without your changes, I wouldn't assume a 
test failure is unrelated. The Apache Jenkins failures on FreeBSD with the 
blackhole are a whole different ballgame.

 Untangle XML-specific nature of Config and Container classes
 

 Key: SOLR-4196
 URL: https://issues.apache.org/jira/browse/SOLR-4196
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, StressTest.zip, StressTest.zip, StressTest.zip


 sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need 
 to pull all of the specific XML processing out of Config and Container. 
 Currently, we refer to xpaths all over the place. This JIRA is about 
 providing a thunking layer to isolate the XML-esque nature of solr.xml and 
 allow a simple properties file to be used instead which will lead, 
 eventually, to solr.xml going away.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4571) speedup disjunction with minShouldMatch

2013-02-20 Thread Stefan Pohl (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Pohl updated LUCENE-4571:


Attachment: LUCENE-4571.patch

Robert, thank you for the excellent feedback.

I didn't look at the BS in detail for a while, and all you say sounds very 
reasonable. My statement about "improvement by a factor" was under the 
assumption that BS would, similarly to BS2, generate all OR-candidates and only 
afterwards screen many of them out again due to the minimum-match constraint. If 
that's the case, and we assume BS to be faster than BS2 by some factor, then it 
will be the very same factor faster on a larger collection, whereas an optimized 
BS2 might become faster than BS because it does not generate the many useless 
candidates for queries that have only one super-common term (proportional to 
document collection size) and a minimum-match constraint of at least 2. Hope 
this makes sense now, but of course my assumption might be wrong :)

In fact, I got to think quite a bit about different approaches to implementing 
a minimum-match optimized version of BS2 and converged on implementation 
approach 1), because the others add other expensive operations/overhead when you 
get down to all the details. The attached patch contains a full drop-in 
replacement for the DisjunctionSumScorer in DisjunctionSumScorerMM.java and 
accordingly changes references within BooleanScorer2. All existing tests pass.
As this scorer is supposed to work with any subclauses (not only TermQuery), I 
decided on an implementation that dynamically orders the first mm-1 subscorers 
by their next docid, hence exploiting local within-inverted-list docid 
distributions. Fixing the mm-1 subscorers on the basis of their doc 
frequency/sparseness estimation could be better (fewer heap operations, but no 
exploitation of within-list docid distributions), but that estimation is 
currently only available for TermQuery clauses and would hence limit the 
applicability of the implementation. Having an API to determine the sparseness 
of a subscorer already came up in some other tickets, and many other Scorer 
implementations could similarly benefit from its availability.

I however share your thinking about not intermingling too many aspects within 
one Scorer, making it overly complex and probably less amenable to VM 
optimizations (e.g. not as tight loops). This is why I implemented it in a 
separate class, so you could go ahead and remove the mm-constraint from 
DisjunctionSumScorer and use either implementation depending on the given query.
Also, this implementation could still be tested response-time-wise for 
different representative queries of interest and different mm-constraint 
settings (luceneutil?). I wouldn't be surprised if my implementation is not as 
quick as DisjunctionSumScorer when mm=1, but I have anecdotal evidence that it 
does a great job (speedups of 2-3x) for higher mm (and longer queries).
I don't quite see why the attached implementation would do more heap 
operations, but I agree that it could be slower due to lengthier/more complex 
loops, a few more if statements, etc.

Hope this contribution helps.
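
To make the candidate-skipping idea concrete outside of Lucene, here is a toy 
model over plain sorted docid arrays (illustrative only, not the patch): it 
dynamically holds back the mm-1 cursors with the smallest docids and advances 
them straight to the next possible candidate, instead of enumerating every 
OR-candidate and filtering afterwards.

{code}
import java.util.*;

public class MinShouldMatchSketch {
  static class Cursor {                           // stand-in for a subscorer
    final int[] docs; int pos = 0;
    Cursor(int[] docs) { this.docs = docs; }
    int doc() { return pos < docs.length ? docs[pos] : Integer.MAX_VALUE; }
    void advanceTo(int target) { while (doc() < target) pos++; }
  }

  static List<Integer> atLeast(int mm, int[]... lists) {
    PriorityQueue<Cursor> heap = new PriorityQueue<Cursor>(lists.length,
        new Comparator<Cursor>() {
          public int compare(Cursor a, Cursor b) { return a.doc() - b.doc(); }
        });
    for (int[] l : lists) heap.add(new Cursor(l));
    List<Integer> hits = new ArrayList<Integer>();
    while (true) {
      // hold back the mm-1 cursors with the smallest docids: on their own
      // they can never make a doc reach mm matches, so the next heap top
      // is the smallest possible candidate
      List<Cursor> behind = new ArrayList<Cursor>();
      for (int k = 0; k < mm - 1; k++) behind.add(heap.poll());
      int candidate = heap.peek().doc();
      if (candidate == Integer.MAX_VALUE) break;  // all lists exhausted
      for (Cursor c : behind) {
        c.advanceTo(candidate);                   // skip, don't enumerate
        heap.add(c);
      }
      List<Cursor> at = new ArrayList<Cursor>();
      for (Cursor c : heap) if (c.doc() == candidate) at.add(c);
      if (at.size() >= mm) hits.add(candidate);
      heap.removeAll(at);                           // step everyone sitting on
      for (Cursor c : at) { c.pos++; heap.add(c); } // the candidate past it
    }
    return hits;
  }

  public static void main(String[] args) {
    // docs matched by at least 2 of the 3 lists: 5 and 9
    System.out.println(atLeast(2, new int[]{1, 5, 9}, new int[]{5, 9}, new int[]{2, 9}));
  }
}
{code}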

 speedup disjunction with minShouldMatch 
 

 Key: LUCENE-4571
 URL: https://issues.apache.org/jira/browse/LUCENE-4571
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.1
Reporter: Mikhail Khludnev
 Attachments: LUCENE-4571.patch


 even if minShouldMatch is supplied to DisjunctionSumScorer, it enumerates the whole 
 disjunction and verifies the minShouldMatch condition [on every 
 doc|https://github.com/apache/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/DisjunctionSumScorer.java#L70]:
 {code}
   public int nextDoc() throws IOException {
     assert doc != NO_MORE_DOCS;
     while(true) {
       while (subScorers[0].docID() == doc) {
         if (subScorers[0].nextDoc() != NO_MORE_DOCS) {
           heapAdjust(0);
         } else {
           heapRemoveRoot();
           if (numScorers < minimumNrMatchers) {
             return doc = NO_MORE_DOCS;
           }
         }
       }
       afterNext();
       if (nrMatchers >= minimumNrMatchers) {
         break;
       }
     }
     return doc;
   }
 {code}
 [~spo] proposes (as far as I get it) to pop nrMatchers-1 scorers from the 
 heap first, and then push them back, advancing behind that top doc. For me, 
 question no. 1 is: is there a performance test for minShouldMatch-constrained 
 disjunction? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: 

[jira] [Created] (LUCENE-4783) Inconsistent results, changing based on recent previous searches (caching?)

2013-02-20 Thread William Johnson (JIRA)
William Johnson created LUCENE-4783:
---

 Summary: Inconsistent results, changing based on recent previous 
searches (caching?)
 Key: LUCENE-4783
 URL: https://issues.apache.org/jira/browse/LUCENE-4783
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.1
 Environment: Ubuntu Linux  Java application running under Tomcat
Reporter: William Johnson


We have several repeatable cases where Lucene returns different candidates 
for the same search, on the same (static) index, depending on what other 
searches have been run beforehand.

It appears as though Lucene is failing to find matches in some cases if they 
have not been cached by a previous search.

Specifically (although it is happening with more than just fuzzy searches), a 
fuzzy search on a misspelled street name returns no result. If you then search 
on the correctly spelled street name, and THEN return to the original fuzzy 
query on the original incorrect spelling, you now receive the result.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4571) speedup disjunction with minShouldMatch

2013-02-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582337#comment-13582337
 ] 

Robert Muir commented on LUCENE-4571:
-

Stefan this looks very promising! I think we should add support for this query 
to luceneutil and try it out.

About the docfreq idea, a scorer/disi cost-estimation patch exists in at least 
two places. For example, TermScorer returns docFreq, and disjunctions return the 
sum over their subscorers. Actually this speeds up conjunctions in general and 
removes the need for the specialized ConjunctionTermScorer. I think it would be 
useful here too. I'll find the link and add it in a comment in a bit.

 speedup disjunction with minShouldMatch 
 

 Key: LUCENE-4571
 URL: https://issues.apache.org/jira/browse/LUCENE-4571
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.1
Reporter: Mikhail Khludnev
 Attachments: LUCENE-4571.patch


 even if minShouldMatch is supplied to DisjunctionSumScorer, it enumerates the whole 
 disjunction and verifies the minShouldMatch condition [on every 
 doc|https://github.com/apache/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/DisjunctionSumScorer.java#L70]:
 {code}
   public int nextDoc() throws IOException {
     assert doc != NO_MORE_DOCS;
     while(true) {
       while (subScorers[0].docID() == doc) {
         if (subScorers[0].nextDoc() != NO_MORE_DOCS) {
           heapAdjust(0);
         } else {
           heapRemoveRoot();
           if (numScorers < minimumNrMatchers) {
             return doc = NO_MORE_DOCS;
           }
         }
       }
       afterNext();
       if (nrMatchers >= minimumNrMatchers) {
         break;
       }
     }
     return doc;
   }
 {code}
 [~spo] proposes (as far as I get it) to pop nrMatchers-1 scorers from the 
 heap first, and then push them back, advancing behind that top doc. For me, 
 question no. 1 is: is there a performance test for minShouldMatch-constrained 
 disjunction? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4571) speedup disjunction with minShouldMatch

2013-02-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582347#comment-13582347
 ] 

Robert Muir commented on LUCENE-4571:
-

Here's the most recent patch for what I discussed: LUCENE-4607

 speedup disjunction with minShouldMatch 
 

 Key: LUCENE-4571
 URL: https://issues.apache.org/jira/browse/LUCENE-4571
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.1
Reporter: Mikhail Khludnev
 Attachments: LUCENE-4571.patch


 even if minShouldMatch is supplied to DisjunctionSumScorer, it enumerates the whole 
 disjunction and verifies the minShouldMatch condition [on every 
 doc|https://github.com/apache/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/DisjunctionSumScorer.java#L70]:
 {code}
   public int nextDoc() throws IOException {
     assert doc != NO_MORE_DOCS;
     while(true) {
       while (subScorers[0].docID() == doc) {
         if (subScorers[0].nextDoc() != NO_MORE_DOCS) {
           heapAdjust(0);
         } else {
           heapRemoveRoot();
           if (numScorers < minimumNrMatchers) {
             return doc = NO_MORE_DOCS;
           }
         }
       }
       afterNext();
       if (nrMatchers >= minimumNrMatchers) {
         break;
       }
     }
     return doc;
   }
 {code}
 [~spo] proposes (as far as I get it) to pop nrMatchers-1 scorers from the 
 heap first, and then push them back, advancing behind that top doc. For me, 
 question no. 1 is: is there a performance test for minShouldMatch-constrained 
 disjunction? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4784) Out of date API document-ValueSourceQuery

2013-02-20 Thread Hao Zhong (JIRA)
Hao Zhong created LUCENE-4784:
-

 Summary: Out of date API document-ValueSourceQuery
 Key: LUCENE-4784
 URL: https://issues.apache.org/jira/browse/LUCENE-4784
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/query/scoring
Affects Versions: 4.1
Reporter: Hao Zhong
Priority: Critical


The following API documents talk about ValueSourceQuery:
http://lucene.apache.org/core/4_1_0/queries/org/apache/lucene/queries/CustomScoreProvider.html
http://lucene.apache.org/core/4_1_0/queries/org/apache/lucene/queries/CustomScoreQuery.html
However, ValueSourceQuery was removed in Lucene 4.1, according to the following 
migration guide:
http://lucene.apache.org/core/4_1_0/MIGRATE.html
The guide lists the replacement classes for those removed:
... o.a.l.search.function.ValueSourceQuery -> 
o.a.l.queries.function.FunctionQuery

Please update the API documents to reflect the latest code.
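
For anyone landing on those pages, the migration itself is mechanical; a sketch 
under the 4.1 API (the field name is made up):

{code}
// before (removed): new o.a.l.search.function.ValueSourceQuery(valueSource)
// after, in the queries module:
Query q = new FunctionQuery(new LongFieldSource("popularity"));
{code}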

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4785) Out of date API document-RangeQuery

2013-02-20 Thread Hao Zhong (JIRA)
Hao Zhong created LUCENE-4785:
-

 Summary: Out of date API document-RangeQuery
 Key: LUCENE-4785
 URL: https://issues.apache.org/jira/browse/LUCENE-4785
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 4.1
Reporter: Hao Zhong
Priority: Critical


The following API documents talk about RangeQuery:
http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/classic/QueryParserBase.html
http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/document/DateTools.html
However, RangeQuery was removed in Lucene 4.1, according to the change log:
http://lucene.apache.org/core/4_1_0/changes/Changes.html
 

LUCENE-1944, LUCENE-1856, LUCENE-1957, LUCENE-1960, LUCENE-1961, LUCENE-1968, 
LUCENE-1970, LUCENE-1946, LUCENE-1971, LUCENE-1975, LUCENE-1972, LUCENE-1978, 
LUCENE-944, LUCENE-1979, LUCENE-1973, LUCENE-2011: Remove deprecated 
methods/constructors/classes:
...  
Remove *RangeQuery*, RangeFilter and ConstantScoreRangeQuery. 

Please update the API documents to reflect the latest code.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4786) Out of date API document-SinkTokenizer

2013-02-20 Thread Hao Zhong (JIRA)
Hao Zhong created LUCENE-4786:
-

 Summary: Out of date API document-SinkTokenizer
 Key: LUCENE-4786
 URL: https://issues.apache.org/jira/browse/LUCENE-4786
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Hao Zhong


The following API document talks about SinkTokenizer:
http://lucene.apache.org/core/4_1_0/analyzers-common/org/apache/lucene/analysis/sinks/package-summary.html

However, SinkTokenizer was removed and replaced by TeeSinkTokenFilter in Lucene 
4.1, according to the change log:
http://lucene.apache.org/core/4_1_0/changes/Changes.html

LUCENE-1422, LUCENE-1693: New TokenStream API that uses a new class called 
AttributeSource instead of the Token class, which is now a utility class that 
holds common Token attributes. All attributes that the Token class had have 
been moved into separate classes: TermAttribute, OffsetAttribute, 
PositionIncrementAttribute, PayloadAttribute, TypeAttribute and FlagsAttribute. 
The new API is much more flexible; it allows to combine the Attributes 
arbitrarily and also to define custom Attributes. The new API has the same 
performance as the old next(Token) approach. *For conformance with this new API 
Tee-/SinkTokenizer was deprecated and replaced by a new TeeSinkTokenFilter*. 

Please update the API documents to reflect the latest code.
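
For reference, the replacement usage is along these lines (a sketch against the 
4.1 sinks API; the analyzer wiring around it is omitted):

{code}
// split one token stream into the main path plus a sink that
// consumes the same tokens
TeeSinkTokenFilter tee = new TeeSinkTokenFilter(sourceTokenStream);
TokenStream sink = tee.newSinkTokenStream();
{code}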


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes

2013-02-20 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582373#comment-13582373
 ] 

Erick Erickson commented on SOLR-4196:
--

I'm not communicating. There aren't any test failures on my machine so far. I 
was just hoping, vainly, that somehow re-arranging the CoreContainer code would 
expose whatever's been happening on Jenkins based _solely_ on similarity of 
error messages. Turns out, as you say, that probably isn't the case.

Tests with the changes in the latest patch are running just fine, at least 
once. I'll give them another couple of whirls after I've run through a few 
hours of the stress tests.



 Untangle XML-specific nature of Config and Container classes
 

 Key: SOLR-4196
 URL: https://issues.apache.org/jira/browse/SOLR-4196
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, StressTest.zip, StressTest.zip, StressTest.zip


 sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need 
 to pull all of the specific XML processing out of Config and Container. 
 Currently, we refer to xpaths all over the place. This JIRA is about 
 providing a thunking layer to isolate the XML-esque nature of solr.xml and 
 allow a simple properties file to be used instead which will lead, 
 eventually, to solr.xml going away.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes

2013-02-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582385#comment-13582385
 ] 

Mark Miller commented on SOLR-4196:
---

I'm just letting you know for the future - I know you have brought up that test 
a couple of times, and I'd be very interested in the failure if it was something 
you are seeing locally.

But if you are just going by what Apache Jenkins is reporting, that is a 
different ball of wax, and I have all the info I need.

So you should be able to trust those tests locally - and if you can't, that is 
something we should fix.

 Untangle XML-specific nature of Config and Container classes
 

 Key: SOLR-4196
 URL: https://issues.apache.org/jira/browse/SOLR-4196
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, StressTest.zip, StressTest.zip, StressTest.zip


 sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need 
 to pull all of the specific XML processing out of Config and Container. 
 Currently, we refer to xpaths all over the place. This JIRA is about 
 providing a thunking layer to isolate the XML-esque nature of solr.xml and 
 allow a simple properties file to be used instead which will lead, 
 eventually, to solr.xml going away.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4787) The QueryScorer.getMaxWeight method is not found.

2013-02-20 Thread Hao Zhong (JIRA)
Hao Zhong created LUCENE-4787:
-

 Summary: The QueryScorer.getMaxWeight method is not found.
 Key: LUCENE-4787
 URL: https://issues.apache.org/jira/browse/LUCENE-4787
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/highlighter
Affects Versions: 4.1
Reporter: Hao Zhong
Priority: Critical


The following API documents refer to the QueryScorer.getMaxWeight method:
http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/package-summary.html
The QueryScorer.getMaxWeight method is useful when passed to the 
GradientFormatter constructor to define the top score which is associated with 
the top color.
http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html
See QueryScorer.getMaxWeight which can be used to calibrate scoring scale

However, the QueryScorer class does not declare a getMaxWeight method in Lucene 
4.1, according to its document:
http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/QueryScorer.html

Instead, the class declares a getMaxTermWeight method. Is that the correct 
method in the preceding two documents? If it is, please revise the two 
documents. 
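
Assuming getMaxTermWeight is indeed the intended method, the calibration the 
documents describe would read roughly as follows (an illustrative fragment only; 
note the scorer's max weight is only populated once highlighting has started):

{code}
QueryScorer scorer = new QueryScorer(query);
Formatter formatter = new GradientFormatter(
    scorer.getMaxTermWeight(),  // top score mapped to the "max" color
    null, "#FF0000",            // foreground: default .. red
    null, null);                // background left unchanged
Highlighter highlighter = new Highlighter(formatter, scorer);
{code}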


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 186 - Failure

2013-02-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/186/

1 tests failed.
FAILED:  
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage

Error Message:
expected:[74 65 63 68 6e 6f 6c 6f 67 79] but was:[70 6f 6c 69 74 69 63 73]

Stack Trace:
java.lang.AssertionError: expected:[74 65 63 68 6e 6f 6c 6f 67 79] but 
was:[70 6f 6c 69 74 69 63 73]
at 
__randomizedtesting.SeedInfo.seed([D18A2E4C5ACD05CE:8A9997A9D6C55A2E]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:147)
at 
org.apache.lucene.classification.ClassificationTestBase.checkCorrectClassification(ClassificationTestBase.java:68)
at 
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage(SimpleNaiveBayesClassifierTest.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:679)




Build Log:
[...truncated 5832 lines...]
[junit4:junit4] Suite: 
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest
[junit4:junit4]   2> NOTE: download the large 

[jira] [Created] (LUCENE-4788) Out of date code examples

2013-02-20 Thread Hao Zhong (JIRA)
Hao Zhong created LUCENE-4788:
-

 Summary: Out of date code examples
 Key: LUCENE-4788
 URL: https://issues.apache.org/jira/browse/LUCENE-4788
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Affects Versions: 4.1
Reporter: Hao Zhong
Priority: Critical


The following API document has two code examples:
http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/index/OrdinalMappingAtomicReader.html
// merge the old taxonomy with the new one.
 OrdinalMap map = DirectoryTaxonomyWriter.addTaxonomies();

Both code examples call the DirectoryTaxonomyWriter.addTaxonomies method. 
Lucene 3.5 has that method, according to its documentation:
http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/all/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyWriter.html

However, Lucene 4.1 does not have such a method, according to its documentation:
http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyWriter.html
Please update the code examples to reflect the current API.
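
For comparison, a sketch of what the updated example would presumably look like 
against the 4.1 API, assuming (from the 4.1 javadoc) that the static 
addTaxonomies() was replaced by the instance method addTaxonomy(Directory, 
OrdinalMap):

{noformat}
import java.io.IOException;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.MemoryOrdinalMap;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.OrdinalMap;
import org.apache.lucene.store.Directory;

class TaxonomyMergeSketch {
  // Merge the old taxonomy into the one managed by writer; the returned map
  // translates old ordinals to new ones, e.g. for OrdinalMappingAtomicReader.
  static OrdinalMap merge(DirectoryTaxonomyWriter writer, Directory oldTaxoDir)
      throws IOException {
    OrdinalMap map = new MemoryOrdinalMap();
    writer.addTaxonomy(oldTaxoDir, map);
    return map;
  }
}
{noformat}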


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 186 - Failure

2013-02-20 Thread Steve Rowe
Reproduces for me locally with the same seed.

I also saw this in IntelliJ while getting the classification module 
configuration in shape - different seed though: CF99EEAD4D1B8F7E.  This seed 
reproduces the failure for me under Ant.

This test sometimes succeeds under Ant, Maven, and IntelliJ.

Steve

On Feb 20, 2013, at 1:15 PM, Apache Jenkins Server jenk...@builds.apache.org 
wrote:

 Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/186/
 
 1 tests failed.
 FAILED:  
 org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage
 
 Error Message:
 expected:[74 65 63 68 6e 6f 6c 6f 67 79] but was:[70 6f 6c 69 74 69 63 73]
 
 Stack Trace:
 java.lang.AssertionError: expected:[74 65 63 68 6e 6f 6c 6f 67 79] but 
 was:[70 6f 6c 69 74 69 63 73]
   at 
 __randomizedtesting.SeedInfo.seed([D18A2E4C5ACD05CE:8A9997A9D6C55A2E]:0)
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:147)
   at 
 org.apache.lucene.classification.ClassificationTestBase.checkCorrectClassification(ClassificationTestBase.java:68)
   at 
 org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage(SimpleNaiveBayesClassifierTest.java:33)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:616)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
   at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
   at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
   at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
   at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
   at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
   at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
   at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
   at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
   at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
   at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
   at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
   at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
   at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
   at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
   at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
   at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
   at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
   at 
 

[jira] [Created] (LUCENE-4789) Typos in API documentation

2013-02-20 Thread Hao Zhong (JIRA)
Hao Zhong created LUCENE-4789:
-

 Summary: Typos in API documentation
 Key: LUCENE-4789
 URL: https://issues.apache.org/jira/browse/LUCENE-4789
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.1
Reporter: Hao Zhong


http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/analysis/package-summary.html
neccessary->necessary

http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/LogMergePolicy.html
exceesd->exceed

http://lucene.apache.org/core/4_1_0/queryparser/serialized-form.html
http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/classic/ParseException.html
followng->following

http://lucene.apache.org/core/4_1_0/codecs/org/apache/lucene/codecs/bloom/FuzzySet.html
qccuracy->accuracy

http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/search/params/FacetRequest.html
methonds->methods

http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/parser/CharStream.html
implemetation->implementation

http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/search/TimeLimitingCollector.html
construcutor->constructor

http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/store/BufferedIndexInput.html
bufer->buffer

http://lucene.apache.org/core/4_1_0/analyzers-kuromoji/org/apache/lucene/analysis/ja/JapaneseIterationMarkCharFilter.html
horizonal->horizontal

http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/taxonomy/writercache/lru/NameHashIntCacheLRU.html
cahce->cache

http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/processors/BooleanQuery2ModifierNodeProcessor.html
precidence->precedence

http://lucene.apache.org/core/4_1_0/analyzers-stempel/org/egothor/stemmer/MultiTrie.html
http://lucene.apache.org/core/4_1_0/analyzers-stempel/org/egothor/stemmer/MultiTrie2.html
commmands->commands

Please revise the documentation. 




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (LUCENE-4789) Typos in API documentation

2013-02-20 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe reassigned LUCENE-4789:
--

Assignee: Steve Rowe

 Typos in API documentation
 --

 Key: LUCENE-4789
 URL: https://issues.apache.org/jira/browse/LUCENE-4789
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.1
Reporter: Hao Zhong
Assignee: Steve Rowe

 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/analysis/package-summary.html
 neccessary->necessary
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/LogMergePolicy.html
 exceesd->exceed
 http://lucene.apache.org/core/4_1_0/queryparser/serialized-form.html
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/classic/ParseException.html
 followng->following
 http://lucene.apache.org/core/4_1_0/codecs/org/apache/lucene/codecs/bloom/FuzzySet.html
 qccuracy->accuracy
 http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/search/params/FacetRequest.html
 methonds->methods
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/parser/CharStream.html
 implemetation->implementation
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/search/TimeLimitingCollector.html
 construcutor->constructor
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/store/BufferedIndexInput.html
 bufer->buffer
 http://lucene.apache.org/core/4_1_0/analyzers-kuromoji/org/apache/lucene/analysis/ja/JapaneseIterationMarkCharFilter.html
 horizonal->horizontal
 http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/taxonomy/writercache/lru/NameHashIntCacheLRU.html
 cahce->cache
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/processors/BooleanQuery2ModifierNodeProcessor.html
 precidence->precedence
 http://lucene.apache.org/core/4_1_0/analyzers-stempel/org/egothor/stemmer/MultiTrie.html
 http://lucene.apache.org/core/4_1_0/analyzers-stempel/org/egothor/stemmer/MultiTrie2.html
 commmands->commands
 Please revise the documentation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4790) FieldCache.getDocTermOrds back to the future bug

2013-02-20 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-4790:
---

 Summary: FieldCache.getDocTermOrds back to the future bug
 Key: LUCENE-4790
 URL: https://issues.apache.org/jira/browse/LUCENE-4790
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


Found while working on LUCENE-4765:

FieldCache.getDocTermOrds unsafely bakes liveDocs into its structure.

This means that if you have readers at two points in time (r1, r2) and you 
happen to call getDocTermOrds first on r2, then call it on r1, the results 
will be incorrect.

The simple fix is to make DocTermOrds' uninvert take liveDocs explicitly: 
FieldCacheImpl always passes null, and Solr's UninvertedField just keeps doing 
what it's doing today (since it's a top-level reader, and cached somewhere 
else).

Also, DocTermOrds had a telescoping ctor that was uninverting twice. 
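
A toy illustration of the trap in plain Java (not the actual FieldCache code): 
the cache is keyed only by the reader core, which r1 and r2 share, while the 
cached value silently depends on whichever liveDocs happened to arrive first:

{noformat}
import java.util.HashMap;
import java.util.Map;

class UninvertCacheToy {
  // Keyed by the shared core; r1 and r2 map to the same entry.
  private final Map<Object, int[]> byCore = new HashMap<Object, int[]>();

  int[] getDocTermOrds(Object coreKey, boolean[] liveDocs) {
    int[] cached = byCore.get(coreKey);
    if (cached == null) {
      cached = uninvert(liveDocs); // BUG: bakes r2's deletions into the
      byCore.put(coreKey, cached); // entry that r1 will later receive
    }
    return cached;
  }

  private int[] uninvert(boolean[] liveDocs) {
    int live = 0;
    for (boolean b : liveDocs) if (b) live++;
    return new int[live]; // stand-in for the real uninverted structure
  }
}
{noformat}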

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4790) FieldCache.getDocTermOrds back to the future bug

2013-02-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4790:


Attachment: LUCENE-4790.patch

Here's a test with my proposed fix. Again, it's just to make the liveDocs always 
an explicit parameter so there are no traps or confusion, and FieldCacheImpl 
always passes null.

 FieldCache.getDocTermOrds back to the future bug
 

 Key: LUCENE-4790
 URL: https://issues.apache.org/jira/browse/LUCENE-4790
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-4790.patch


 Found while working on LUCENE-4765:
 FieldCache.getDocTermOrds unsafely bakes liveDocs into its structure.
 This means that if you have readers at two points in time (r1, r2) and you 
 happen to call getDocTermOrds first on r2, then call it on r1, the results 
 will be incorrect.
 The simple fix is to make DocTermOrds' uninvert take liveDocs explicitly: 
 FieldCacheImpl always passes null, and Solr's UninvertedField just keeps 
 doing what it's doing today (since it's a top-level reader, and cached 
 somewhere else).
 Also, DocTermOrds had a telescoping ctor that was uninverting twice. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4790) FieldCache.getDocTermOrds back to the future bug

2013-02-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582436#comment-13582436
 ] 

Michael McCandless commented on LUCENE-4790:


+1

 FieldCache.getDocTermOrds back to the future bug
 

 Key: LUCENE-4790
 URL: https://issues.apache.org/jira/browse/LUCENE-4790
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-4790.patch


 Found while working on LUCENE-4765:
 FieldCache.getDocTermOrds unsafely bakes liveDocs into its structure.
 This means that if you have readers at two points in time (r1, r2) and you 
 happen to call getDocTermOrds first on r2, then call it on r1, the results 
 will be incorrect.
 The simple fix is to make DocTermOrds' uninvert take liveDocs explicitly: 
 FieldCacheImpl always passes null, and Solr's UninvertedField just keeps 
 doing what it's doing today (since it's a top-level reader, and cached 
 somewhere else).
 Also, DocTermOrds had a telescoping ctor that was uninverting twice. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4571) speedup disjunction with minShouldMatch

2013-02-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582453#comment-13582453
 ] 

Michael McCandless commented on LUCENE-4571:


I fixed luceneutil to recognize +minShouldMatch=N, and made a trivial
tasks file:

{noformat}
HighMinShouldMatch4: ref http from name title +minShouldMatch=4
HighMinShouldMatch3: ref http from name title +minShouldMatch=3
HighMinShouldMatch2: ref http from name title +minShouldMatch=2
HighMinShouldMatch0: ref http from name title
Low1MinShouldMatch4: ref http from name dublin +minShouldMatch=4
Low1MinShouldMatch3: ref http from name dublin +minShouldMatch=3
Low1MinShouldMatch2: ref http from name dublin +minShouldMatch=2
Low1MinShouldMatch0: ref http from name dublin
Low2MinShouldMatch4: ref http from wings dublin +minShouldMatch=4
Low2MinShouldMatch3: ref http from wings dublin +minShouldMatch=3
Low2MinShouldMatch2: ref http from wings dublin +minShouldMatch=2
Low2MinShouldMatch0: ref http from wings dublin
Low3MinShouldMatch4: ref http struck wings dublin +minShouldMatch=4
Low3MinShouldMatch3: ref http struck wings dublin +minShouldMatch=3
Low3MinShouldMatch2: ref http struck wings dublin +minShouldMatch=2
Low3MinShouldMatch0: ref http struck wings dublin
Low4MinShouldMatch4: ref restored struck wings dublin +minShouldMatch=4
Low4MinShouldMatch3: ref restored struck wings dublin +minShouldMatch=3
Low4MinShouldMatch2: ref restored struck wings dublin +minShouldMatch=2
Low4MinShouldMatch0: ref restored struck wings dublin
{noformat}

So, each query has 5 terms.  High* means all 5 are high freq, Low1*
means one term is low freq and 4 are high, Low2* means 2 terms are low
freq and 3 are high, etc.
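
If it helps map the task names to code: each task is just a 5-term SHOULD-only 
BooleanQuery with minimumNumberShouldMatch set to the trailing N. The field 
name "body" and the helper below are illustrative, not luceneutil's actual 
plumbing:

{noformat}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

class MinShouldMatchTasks {
  // e.g. buildQuery("body", new String[] {"ref","http","from","name","title"}, 4)
  // corresponds to the HighMinShouldMatch4 task above.
  static BooleanQuery buildQuery(String field, String[] terms, int minShouldMatch) {
    BooleanQuery bq = new BooleanQuery();
    for (String t : terms) {
      bq.add(new TermQuery(new Term(field, t)), Occur.SHOULD);
    }
    bq.setMinimumNumberShouldMatch(minShouldMatch); // 0 = plain disjunction
    return bq;
  }
}
{noformat}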

I tested on the 10 M doc wikimedium index, and for both base (= trunk)
and comp (= this patch) I forcefully disabled BS1:

{noformat}
Task                 QPS base  StdDev  QPS comp  StdDev  Pct diff
Low3MinShouldMatch2      3.95  (3.5%)      3.00  (2.1%)  -24.1% ( -28% -  -19%)
Low1MinShouldMatch2      1.93  (3.1%)      1.50  (2.1%)  -22.4% ( -26% -  -17%)
Low2MinShouldMatch2      2.52  (3.4%)      1.96  (2.0%)  -22.3% ( -26% -  -17%)
HighMinShouldMatch2      1.62  (3.2%)      1.27  (2.2%)  -21.3% ( -25% -  -16%)
HighMinShouldMatch3      1.65  (3.5%)      1.31  (2.3%)  -20.7% ( -25% -  -15%)
Low4MinShouldMatch0      6.91  (3.9%)      5.79  (1.6%)  -16.2% ( -20% -  -11%)
Low1MinShouldMatch3      1.98  (3.4%)      1.66  (2.3%)  -15.8% ( -20% -  -10%)
Low3MinShouldMatch0      3.69  (3.2%)      3.21  (2.1%)  -13.0% ( -17% -   -8%)
Low2MinShouldMatch0      2.38  (3.0%)      2.09  (1.9%)  -12.3% ( -16% -   -7%)
Low1MinShouldMatch0      1.84  (2.7%)      1.65  (2.2%)  -10.4% ( -14% -   -5%)
HighMinShouldMatch0      1.56  (2.9%)      1.41  (2.5%)   -9.8% ( -14% -   -4%)
HighMinShouldMatch4      1.67  (3.6%)      1.55  (2.8%)   -7.1% ( -13% -    0%)
Low2MinShouldMatch3      2.64  (3.8%)      2.65  (2.4%)    0.3% (  -5% -    6%)
Low1MinShouldMatch4      2.02  (3.5%)      2.36  (2.8%)   16.8% (  10% -   23%)
Low4MinShouldMatch2      8.53  (5.3%)     33.74  (5.8%)  295.8% ( 270% -  324%)
Low4MinShouldMatch3      8.56  (5.4%)     44.93  (8.6%)  424.8% ( 389% -  463%)
Low3MinShouldMatch3      4.25  (4.1%)     23.48  (8.8%)  452.7% ( 422% -  485%)
Low4MinShouldMatch4      8.59  (5.2%)     59.53 (11.1%)  593.3% ( 548% -  643%)
Low2MinShouldMatch4      2.68  (3.9%)     21.38 (14.3%)  696.8% ( 653% -  743%)
Low3MinShouldMatch4      4.25  (4.1%)     34.97 (15.4%)  722.5% ( 675% -  773%)
{noformat}

The new scorer is waaay faster when the minShouldMatch constraint is
highly restrictive, i.e. when .advance is being used on only low-freq
terms (I think?).  It's a bit slower for the no-minShouldMatch case
(*MinShouldMatch0).  When .advance is sometimes used on the high-freq
terms it's a bit slower than BS2 today.

I ran a 2nd test, this time with BS1 as the baseline.  BS1 is faster
than BS2, but indeed it still evaluates all subs and only rules out
minShouldMatch in the end.  I had to turn off luceneutil's score
comparisons since BS1/BS2 produce different scores:

{noformat}
Task                 QPS base  StdDev  QPS comp  StdDev  Pct diff
HighMinShouldMatch2      3.33  (8.8%)      1.30  (0.8%)  -60.9% ( -64% -  -56%)
HighMinShouldMatch3      3.35  (8.8%)      1.33  (1.0%)  -60.5% ( -64% -  -55%)
Low1MinShouldMatch2      3.79  (8.4%)      1.52  (0.9%)  -59.9% ( -63% -  -55%)
HighMinShouldMatch0

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_38) - Build # 4349 - Failure!

2013-02-20 Thread Michael McCandless
On Wed, Feb 20, 2013 at 8:15 AM, Robert Muir rcm...@gmail.com wrote:
 I'm not sure I really fixed it!

 I fixed IWC to use this merge scheduler and for the test to not be so
 slow, but I noticed the value it always got for totalBytesSize is 0...

That's not right!

I'll dig.

Mike McCandless

http://blog.mikemccandless.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_38) - Build # 4377 - Still Failing!

2013-02-20 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/4377/
Java: 32bit/jdk1.6.0_38 -server -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 29113 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:381: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:320: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:120: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* dev-tools/maven/lucene/classification/pom.xml.template

Total time: 54 minutes 23 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_38 -server -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-4783) Inconsistent results, changing based on recent previous searches (caching?)

2013-02-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582470#comment-13582470
 ] 

Michael McCandless commented on LUCENE-4783:


Can you post a test case showing the issue?


 Inconsistent results, changing based on recent previous searches (caching?)
 ---

 Key: LUCENE-4783
 URL: https://issues.apache.org/jira/browse/LUCENE-4783
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.1
 Environment: Ubuntu Linux  Java application running under Tomcat
Reporter: William Johnson

 We have several repeatable cases where Lucene is returning different 
 candidates for the same search, on the same (static) index, depending on what 
 other searches have been run beforehand.
 It appears as though Lucene is failing to find matches in some cases if they 
 have not been cached by a previous search.
 Specifically (although it is happening with more than just fuzzy searches), a 
 fuzzy search on a misspelled street name returns no result.  If you then 
 search on the correctly spelled street name, and THEN return to the original 
 fuzzy query on the original incorrect spelling, you now receive the result.
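
Until the reporter posts the requested test case, a skeleton along these lines 
(field name and spellings are made up) would demonstrate the claim if the two 
printed counts differ; the report implies first=0 and second=1, while both 
should be 1:

{noformat}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class FuzzyConsistencyRepro {
  public static void main(String[] args) throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(
        Version.LUCENE_41, new StandardAnalyzer(Version.LUCENE_41)));
    Document doc = new Document();
    doc.add(new TextField("street", "mulberry", Store.YES));
    w.addDocument(doc);
    w.close();

    IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
    FuzzyQuery fuzzy = new FuzzyQuery(new Term("street", "mullberry")); // misspelled
    int first = searcher.search(fuzzy, 10).totalHits;
    searcher.search(new TermQuery(new Term("street", "mulberry")), 10); // exact search
    int second = searcher.search(fuzzy, 10).totalHits;
    System.out.println("first=" + first + " second=" + second);
  }
}
{noformat}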

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1448369 - /lucene/dev/branches/branch_4x/dev-tools/maven/lucene/classification/pom.xml.template

2013-02-20 Thread Steve Rowe
Thanks Robert!

On Feb 20, 2013, at 2:48 PM, rm...@apache.org wrote:

 Author: rmuir
 Date: Wed Feb 20 19:48:39 2013
 New Revision: 1448369
 
 URL: http://svn.apache.org/r1448369
 Log:
 add eol-style
 
 Modified:

 lucene/dev/branches/branch_4x/dev-tools/maven/lucene/classification/pom.xml.template
(props changed)
 


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4479) TermVectorComponent NPE when running Solr Cloud

2013-02-20 Thread Vitali Kviatkouski (JIRA)
Vitali Kviatkouski created SOLR-4479:


 Summary: TermVectorComponent NPE when running Solr Cloud
 Key: SOLR-4479
 URL: https://issues.apache.org/jira/browse/SOLR-4479
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.1
Reporter: Vitali Kviatkouski


When running Solr Cloud (just 2 shards, as described in the wiki), I got an NPE:
java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:365)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:722)


To reproduce, follow the guide in the wiki (http://wiki.apache.org/solr/SolrCloud), 
add some documents and then request 
http://localhost:8983/solr/collection1/tvrh?q=*%3A*

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4790) FieldCache.getDocTermOrds back to the future bug

2013-02-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4790.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.2

 FieldCache.getDocTermOrds back to the future bug
 

 Key: LUCENE-4790
 URL: https://issues.apache.org/jira/browse/LUCENE-4790
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4790.patch


 Found while working on LUCENE-4765:
 FieldCache.getDocTermOrds unsafely bakes liveDocs into its structure.
 This means that if you have readers at two points in time (r1, r2) and you 
 happen to call getDocTermOrds first on r2, then call it on r1, the results 
 will be incorrect.
 The simple fix is to make DocTermOrds' uninvert take liveDocs explicitly: 
 FieldCacheImpl always passes null, and Solr's UninvertedField just keeps 
 doing what it's doing today (since it's a top-level reader, and cached 
 somewhere else).
 Also, DocTermOrds had a telescoping ctor that was uninverting twice. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4479) TermVectorComponent NPE when running Solr Cloud

2013-02-20 Thread Vitali Kviatkouski (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitali Kviatkouski updated SOLR-4479:
-

Description: 
When running Solr Cloud (just 2 shards, as described in the wiki), I got an NPE:
java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
. Skipped

To reproduce, follow the guide in the wiki (http://wiki.apache.org/solr/SolrCloud), 
add some documents and then request 
http://localhost:8983/solr/collection1/tvrh?q=*%3A*

If I include the term vector component in the search handler, I get (on the 
second shard):
SEVERE: null:java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:321)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)


  was:
When running Solr Cloud (just simply 2 shards - as described in wiki), got NPE
java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:365)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
at 

[jira] [Updated] (SOLR-4479) TermVectorComponent NPE when running Solr Cloud

2013-02-20 Thread Vitali Kviatkouski (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitali Kviatkouski updated SOLR-4479:
-

Description: 
When running Solr Cloud (just 2 shards, as described in the wiki), I got an NPE:
java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
. Skipped

To reproduce, follow the guide in the wiki (http://wiki.apache.org/solr/SolrCloud), 
add some documents and then request 
http://localhost:8983/solr/collection1/tvrh?q=*%3A*

If I include the term vector component in the search handler, I get (on the 
second shard):
SEVERE: null:java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:321)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)

Also, for our project's needs I rewrote TermVectorComponent; the NPE above went 
away, but a new one appeared:
java.lang.NullPointerException
at 
org.apache.solr.common.util.NamedList.nameValueMapToList(NamedList.java:109)
at org.apache.solr.common.util.NamedList.<init>(NamedList.java:75)
at 
org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:452)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)


  was:
When running Solr Cloud (just simply 2 shards - as described in wiki), got NPE
java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at 

[JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 1357 - Failure

2013-02-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/1357/

1 tests failed.
REGRESSION:  
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage

Error Message:
expected:[74 65 63 68 6e 6f 6c 6f 67 79] but was:[70 6f 6c 69 74 69 63 73]

Stack Trace:
java.lang.AssertionError: expected:[74 65 63 68 6e 6f 6c 6f 67 79] but 
was:[70 6f 6c 69 74 69 63 73]
at 
__randomizedtesting.SeedInfo.seed([73E949696A7AEC3:5C2D2D731AAFF123]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:147)
at 
org.apache.lucene.classification.ClassificationTestBase.checkCorrectClassification(ClassificationTestBase.java:68)
at 
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage(SimpleNaiveBayesClassifierTest.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:679)




Build Log:
[...truncated 5697 lines...]
[junit4:junit4] Suite: 
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest
[junit4:junit4]   2> NOTE: reproduce with: 

[jira] [Updated] (SOLR-4465) Configurable Collectors

2013-02-20 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4465:
-

Attachment: SOLR-4465.patch

First patch, which adds the code to read the collectorFactory element from 
solrconfig.xml. This will be iterated on to add more detail.

 Configurable Collectors
 ---

 Key: SOLR-4465
 URL: https://issues.apache.org/jira/browse/SOLR-4465
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.1
Reporter: Joel Bernstein
 Fix For: 4.2, 5.0

 Attachments: SOLR-4465.patch


 This issue is to add configurable custom collectors to Solr. This expands the 
 design and work done in issue SOLR-1680 to include:
 1) CollectorFactory configuration in solrconfig.xml (a rough shape sketch 
 follows below)
 2) HTTP parameters to allow clients to dynamically select a CollectorFactory 
 and construct a custom Collector.
 3) Make aspects of QueryComponent pluggable so that the output from 
 distributed search can conform with custom collectors at the shard level.
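
A guess at the shape of the plugin point in item 1); the class and method names 
here are hypothetical, and the real signatures live in the attached patch:

{noformat}
import org.apache.lucene.search.Collector;
import org.apache.solr.request.SolrQueryRequest;

// Hypothetical sketch of the factory that solrconfig.xml would declare and
// that an HTTP parameter would select; not the patch's actual class.
public abstract class CollectorFactorySketch {
  // Build the Collector that QueryComponent will drive for this request.
  public abstract Collector newCollector(SolrQueryRequest req, int numDocsToCollect);
}
{noformat}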

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4789) Typos in API documentation

2013-02-20 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved LUCENE-4789.


Resolution: Fixed

Committed fixes for some of these (and some more I noticed along the way) to 
trunk and branch_4x.

Thanks Hao!

{quote}
http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/classic/ParseException.html
[...]
http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/parser/CharStream.html
{quote}

JavaCC generated these ParseException and CharStream files (and several others 
in the project) - I'm not going to change them.

 Typos in API documentation
 --

 Key: LUCENE-4789
 URL: https://issues.apache.org/jira/browse/LUCENE-4789
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.1
Reporter: Hao Zhong
Assignee: Steve Rowe

 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/analysis/package-summary.html
 neccessary->necessary
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/LogMergePolicy.html
 exceesd->exceed
 http://lucene.apache.org/core/4_1_0/queryparser/serialized-form.html
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/classic/ParseException.html
 followng->following
 http://lucene.apache.org/core/4_1_0/codecs/org/apache/lucene/codecs/bloom/FuzzySet.html
 qccuracy->accuracy
 http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/search/params/FacetRequest.html
 methonds->methods
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/parser/CharStream.html
 implemetation->implementation
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/search/TimeLimitingCollector.html
 construcutor->constructor
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/store/BufferedIndexInput.html
 bufer->buffer
 http://lucene.apache.org/core/4_1_0/analyzers-kuromoji/org/apache/lucene/analysis/ja/JapaneseIterationMarkCharFilter.html
 horizonal->horizontal
 http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/taxonomy/writercache/lru/NameHashIntCacheLRU.html
 cahce->cache
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/processors/BooleanQuery2ModifierNodeProcessor.html
 precidence->precedence
 http://lucene.apache.org/core/4_1_0/analyzers-stempel/org/egothor/stemmer/MultiTrie.html
 http://lucene.apache.org/core/4_1_0/analyzers-stempel/org/egothor/stemmer/MultiTrie2.html
 commmands->commands
 Please revise the documentation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4789) Typos in API documentation

2013-02-20 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-4789:
---

Fix Version/s: 5.0
   4.2

 Typos in API documentation
 --

 Key: LUCENE-4789
 URL: https://issues.apache.org/jira/browse/LUCENE-4789
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.1
Reporter: Hao Zhong
Assignee: Steve Rowe
 Fix For: 4.2, 5.0


 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/analysis/package-summary.html
 neccessary->necessary
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/LogMergePolicy.html
 exceesd->exceed
 http://lucene.apache.org/core/4_1_0/queryparser/serialized-form.html
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/classic/ParseException.html
 followng->following
 http://lucene.apache.org/core/4_1_0/codecs/org/apache/lucene/codecs/bloom/FuzzySet.html
 qccuracy->accuracy
 http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/search/params/FacetRequest.html
 methonds->methods
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/parser/CharStream.html
 implemetation->implementation
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/search/TimeLimitingCollector.html
 construcutor->constructor
 http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/store/BufferedIndexInput.html
 bufer->buffer
 http://lucene.apache.org/core/4_1_0/analyzers-kuromoji/org/apache/lucene/analysis/ja/JapaneseIterationMarkCharFilter.html
 horizonal->horizontal
 http://lucene.apache.org/core/4_1_0/facet/org/apache/lucene/facet/taxonomy/writercache/lru/NameHashIntCacheLRU.html
 cahce->cache
 http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/flexible/standard/processors/BooleanQuery2ModifierNodeProcessor.html
 precidence->precedence
 http://lucene.apache.org/core/4_1_0/analyzers-stempel/org/egothor/stemmer/MultiTrie.html
 http://lucene.apache.org/core/4_1_0/analyzers-stempel/org/egothor/stemmer/MultiTrie2.html
 commmands->commands
 Please revise the documentation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.6.0) - Build # 233 - Failure!

2013-02-20 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/233/
Java: 64bit/jdk1.6.0 -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 29195 lines...]
BUILD FAILED
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-4.x-MacOSX/build.xml:381: 
The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-4.x-MacOSX/build.xml:320: 
The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-4.x-MacOSX/extra-targets.xml:120:
 The following files are missing svn:eol-style (or binary svn:mime-type):
* dev-tools/maven/lucene/classification/pom.xml.template

Total time: 89 minutes 31 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.6.0 -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-4465) Configurable Collectors

2013-02-20 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4465:
-

Attachment: SOLR-4465.patch

Added CollectorFactory.java to the patch.

 Configurable Collectors
 ---

 Key: SOLR-4465
 URL: https://issues.apache.org/jira/browse/SOLR-4465
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.1
Reporter: Joel Bernstein
 Fix For: 4.2, 5.0

 Attachments: SOLR-4465.patch, SOLR-4465.patch


 This issue is to add configurable custom collectors to Solr. This expands the 
 design and work done in issue SOLR-1680 to include:
 1) CollectorFactory configuration in solrconfig.xml
 2) Http parameters to allow clients to dynamically select a CollectorFactory 
 and construct a custom Collector.
 3) Make aspects of QueryComponent pluggable so that the output from 
 distributed search can conform with custom collectors at the shard level.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Reopened] (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)

2013-02-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reopened SOLR-669:
--


 SOLR currently does not support caching for (Query, FacetFieldList)
 ---

 Key: SOLR-669
 URL: https://issues.apache.org/jira/browse/SOLR-669
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.3
Reporter: Fuad Efendi
   Original Estimate: 1,680h
  Remaining Estimate: 1,680h

 It is a huge performance bottleneck and it explains the huge difference between 
 qtime and SolrJ's elapsedTime. I quickly browsed SolrIndexSearcher: it caches 
 only (Key, DocSet/DocList of Lucene ids) key-value pairs and it does not have a 
 cache for (Query, FacetFieldList).
 filterCache stores a DocList for each 'filter' and is used for constant 
 recalculations...
 This would be a significant performance improvement.
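
To make the proposal concrete, here is a rough sketch of the kind of (query, facet field list) cache being asked for, assuming a simple synchronized LRU map; all names are hypothetical, this is not an existing Solr class:

{code:java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical LRU cache keyed by (query string, facet field list),
// holding precomputed facet counts per field value.
public class FacetResultCache {
  private final Map<String, Map<String, Integer>> cache;

  public FacetResultCache(final int maxSize) {
    // access-ordered LinkedHashMap that evicts the eldest entry past maxSize
    this.cache = new LinkedHashMap<String, Map<String, Integer>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Map<String, Integer>> e) {
        return size() > maxSize;
      }
    };
  }

  private String key(String query, List<String> facetFields) {
    return query + "|" + facetFields;  // stable composite key
  }

  public synchronized Map<String, Integer> get(String query, List<String> facetFields) {
    return cache.get(key(query, facetFields));
  }

  public synchronized void put(String query, List<String> facetFields,
                               Map<String, Integer> counts) {
    cache.put(key(query, facetFields), counts);
  }
}
{code}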

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)

2013-02-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl closed SOLR-669.


Resolution: Won't Fix

Changed resolution state to Won't Fix. It appears this is not a feature anyone 
finds useful enough to even comment on, much less contribute to, in almost 5 
years, so to me that's a theoretical need, not a real one. Please re-open if you 
(or anyone else) want to see this solved.

 SOLR currently does not support caching for (Query, FacetFieldList)
 ---

 Key: SOLR-669
 URL: https://issues.apache.org/jira/browse/SOLR-669
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.3
Reporter: Fuad Efendi
   Original Estimate: 1,680h
  Remaining Estimate: 1,680h

 It is a huge performance bottleneck and it explains the huge difference between 
 qtime and SolrJ's elapsedTime. I quickly browsed SolrIndexSearcher: it caches 
 only (Key, DocSet/DocList of Lucene ids) key-value pairs and it does not have a 
 cache for (Query, FacetFieldList).
 filterCache stores a DocList for each 'filter' and is used for constant 
 recalculations...
 This would be a significant performance improvement.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4465) Configurable Collectors

2013-02-20 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-4465:
-

Attachment: SOLR-4465.patch

Added CollectorParams.java to hold the HTTP collector parameters. Collector 
parameters use the "cl" prefix.
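
A rough sketch of what such a holder might contain; only the "cl" prefix is taken from the patch note above, the individual parameter names are made up for illustration:

{code:java}
// Hypothetical parameter constants for the collector feature.
public interface CollectorParams {
  String CL = "cl";                    // common prefix for collector params
  String FACTORY = CL + ".factory";    // selects a named CollectorFactory
  String NUM_HITS = CL + ".numHits";   // example collector-specific knob
}
{code}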

 Configurable Collectors
 ---

 Key: SOLR-4465
 URL: https://issues.apache.org/jira/browse/SOLR-4465
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.1
Reporter: Joel Bernstein
 Fix For: 4.2, 5.0

 Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch


 This issue is to add configurable custom collectors to Solr. This expands the 
 design and work done in issue SOLR-1680 to include:
 1) CollectorFactory configuration in solrconfig.xml
 2) Http parameters to allow clients to dynamically select a CollectorFactory 
 and construct a custom Collector.
 3) Make aspects of QueryComponent pluggable so that the output from 
 distributed search can conform with custom collectors at the shard level.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4479) TermVectorComponent NPE when running Solr Cloud

2013-02-20 Thread Vitali Kviatkouski (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitali Kviatkouski updated SOLR-4479:
-

Description: 
When running SolrCloud (just 2 shards, as described in the wiki), I got an NPE:
java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
. Skipped

To reproduce, follow the guide in the wiki (http://wiki.apache.org/solr/SolrCloud), 
add some documents and then request 
http://localhost:8983/solr/collection1/tvrh?q=*%3A*

If I include the term vector component in the search handler, I get (on the second 
shard):
SEVERE: null:java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:321)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)



  was:
When running SolrCloud (just 2 shards, as described in the wiki), I got an NPE:
java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
. Skipped

To reproduce, follow the guide in the wiki (http://wiki.apache.org/solr/SolrCloud), 
add some documents and then request 
http://localhost:8983/solr/collection1/tvrh?q=*%3A*

If I include the term vector component in the search handler, I get (on the second 
shard):
SEVERE: null:java.lang.NullPointerException
at 
org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:321)
at 

[jira] [Commented] (SOLR-4414) MoreLikeThis on a shard finds no interesting terms if the document queried is not in that shard

2013-02-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582635#comment-13582635
 ] 

Shawn Heisey commented on SOLR-4414:


Colin, are you able to make distributed MLT work?  I can't make it work at all. 
 Do my problems require a separate issue?


 MoreLikeThis on a shard finds no interesting terms if the document queried is 
 not in that shard
 ---

 Key: SOLR-4414
 URL: https://issues.apache.org/jira/browse/SOLR-4414
 Project: Solr
  Issue Type: Bug
  Components: MoreLikeThis, SolrCloud
Affects Versions: 4.1
Reporter: Colin Bartolome

 Running a MoreLikeThis query in a cloud works only when the document being 
 queried exists in whatever shard serves the request. If the document is not 
 present in the shard, no interesting terms are found and, consequently, no 
 matches are found.
 h5. Steps to reproduce
 * Edit example/solr/collection1/conf/solrconfig.xml and add this line, with 
 the rest of the request handlers:
 {code:xml}
 <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
 {code}
 * Follow the [simplest SolrCloud 
 example|http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster]
  to get two shards running.
 * Hit this URL: 
 [http://localhost:8983/solr/collection1/mlt?mlt.fl=includes&q=id:3007WFP&mlt.match.include=false&mlt.interestingTerms=list&mlt.mindf=1&mlt.mintf=1]
 * Compare that output to that of this URL: 
 [http://localhost:7574/solr/collection1/mlt?mlt.fl=includes&q=id:3007WFP&mlt.match.include=false&mlt.interestingTerms=list&mlt.mindf=1&mlt.mintf=1]
 The former URL will return a result and list some interesting terms. The 
 latter URL will return no results and list no interesting terms. It will also 
 show this odd XML element:
 {code:xml}
 <null name="response"/>
 {code}
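
For anyone who prefers driving the reproduction from SolrJ, a minimal sketch; the id and field names are copied from the URLs above, everything else assumes the stock two-shard example:

{code:java}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MltRepro {
  public static void main(String[] args) throws Exception {
    // Point at the shard that does NOT hold id:3007WFP to see the failure.
    HttpSolrServer solr = new HttpSolrServer("http://localhost:7574/solr/collection1");
    SolrQuery q = new SolrQuery("id:3007WFP");
    q.setRequestHandler("/mlt");            // route to the MoreLikeThisHandler
    q.set("mlt.fl", "includes");
    q.set("mlt.match.include", false);
    q.set("mlt.interestingTerms", "list");
    q.set("mlt.mindf", 1);
    q.set("mlt.mintf", 1);
    QueryResponse rsp = solr.query(q);
    System.out.println(rsp.getResults());   // comes back empty on the wrong shard
  }
}
{code}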

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Artifacts-4.x - Build # 234 - Failure

2013-02-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Artifacts-4.x/234/

No tests ran.

Build Log:
[...truncated 11360 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/build.xml:510:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/common-build.xml:1745:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/common-build.xml:1368:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/common-build.xml:500:
 Unable to initialize POM pom.xml: Could not find the model file 
'/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/build/poms/lucene/classification/src/java/pom.xml'.
 for project unknown

Total time: 7 minutes 43 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Publishing Javadoc
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-4414) MoreLikeThis on a shard finds no interesting terms if the document queried is not in that shard

2013-02-20 Thread Colin Bartolome (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582673#comment-13582673
 ] 

Colin Bartolome commented on SOLR-4414:
---

Using the {{MoreLikeThisHandler}} and following the steps to reproduce that I 
wrote produces interesting terms on one server, but not on the other. On the 
server that produces interesting terms, the MLT search _is_ performed, but it 
returns matching documents from that server only.

I don't know enough about broker cores to say for sure whether your issue is 
related.

 MoreLikeThis on a shard finds no interesting terms if the document queried is 
 not in that shard
 ---

 Key: SOLR-4414
 URL: https://issues.apache.org/jira/browse/SOLR-4414
 Project: Solr
  Issue Type: Bug
  Components: MoreLikeThis, SolrCloud
Affects Versions: 4.1
Reporter: Colin Bartolome

 Running a MoreLikeThis query in a cloud works only when the document being 
 queried exists in whatever shard serves the request. If the document is not 
 present in the shard, no interesting terms are found and, consequently, no 
 matches are found.
 h5. Steps to reproduce
 * Edit example/solr/collection1/conf/solrconfig.xml and add this line, with 
 the rest of the request handlers:
 {code:xml}
 <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
 {code}
 * Follow the [simplest SolrCloud 
 example|http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster]
  to get two shards running.
 * Hit this URL: 
 [http://localhost:8983/solr/collection1/mlt?mlt.fl=includes&q=id:3007WFP&mlt.match.include=false&mlt.interestingTerms=list&mlt.mindf=1&mlt.mintf=1]
 * Compare that output to that of this URL: 
 [http://localhost:7574/solr/collection1/mlt?mlt.fl=includes&q=id:3007WFP&mlt.match.include=false&mlt.interestingTerms=list&mlt.mindf=1&mlt.mintf=1]
 The former URL will return a result and list some interesting terms. The 
 latter URL will return no results and list no interesting terms. It will also 
 show this odd XML element:
 {code:xml}
 <null name="response"/>
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-20 Thread Fiona Tay (JIRA)
Fiona Tay created SOLR-4480:
---

 Summary: EDisMax parser blows up with query containing single plus 
or minus
 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Minor


We are running Solr with Sunspot, and when we submit a query containing a single 
plus, Solr blows up with the following error:
SOLR Request (5.0ms)  [ path=#<RSolr::Client:0x4c7464ac> parameters={data: 
fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&q=%2B&fl=%2A+score&qf=name_texts+first_name_texts+last_name_texts+file_name_texts&defType=edismax&hl=on&hl.simple.pre=%40%40%40hl%40%40%40&hl.simple.post=%40%40%40endhl%40%40%40&start=0&rows=3,
method: post, params: {:wt=>:ruby}, query: wt=ruby, headers: 
{Content-Type=>application/x-www-form-urlencoded; charset=UTF-8}, path: 
select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
read_timeout: } ]

RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
Encountered "<EOF>" at line 1, column 0.
Was expecting one of:
    <NOT> ...
    "+" ...
    "-" ...
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Artifacts-4.x - Build # 234 - Failure

2013-02-20 Thread Steve Rowe
This failed because the lucene/classification/build.xml improperly specialized 
the dist-maven target.

I committed a fix in r1448473, on branch_4x only, since Tommaso already fixed 
the problem on trunk earlier today.

Steve

On Feb 20, 2013, at 6:08 PM, Apache Jenkins Server jenk...@builds.apache.org 
wrote:

 Build: https://builds.apache.org/job/Lucene-Artifacts-4.x/234/
 
 No tests ran.
 
 Build Log:
 [...truncated 11360 lines...]
 BUILD FAILED
 /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/build.xml:510:
  The following error occurred while executing this line:
 /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/common-build.xml:1745:
  The following error occurred while executing this line:
 /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/common-build.xml:1368:
  The following error occurred while executing this line:
 /usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/common-build.xml:500:
  Unable to initialize POM pom.xml: Could not find the model file 
 '/usr/home/hudson/hudson-slave/workspace/Lucene-Artifacts-4.x/lucene/build/poms/lucene/classification/src/java/pom.xml'.
  for project unknown
 
 Total time: 7 minutes 43 seconds
 Build step 'Invoke Ant' marked build as failure
 Archiving artifacts
 Publishing Javadoc
 Email was triggered for: Failure
 Sending email for trigger: Failure
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Commit Tag Bot: MIA

2013-02-20 Thread Steve Rowe
I haven't seen any activity from the Commit Tag Bot for about 72 hours.

Mark, is there something wrong with it?

Steve

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_38) - Build # 4380 - Failure!

2013-02-20 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/4380/
Java: 32bit/jdk1.6.0_38 -server -XX:+UseParallelGC

1 tests failed.
REGRESSION:  
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage

Error Message:
expected:[74 65 63 68 6e 6f 6c 6f 67 79] but was:[70 6f 6c 69 74 69 63 73]
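
(Decoding the hex makes the assertion readable: the expected bytes spell "technology" and the actual bytes spell "politics". A throwaway decoder, for anyone checking:)

{code:java}
// Decodes "74 65 63 68 6e 6f 6c 6f 67 79" -> "technology" and
// "70 6f 6c 69 74 69 63 73" -> "politics".
public static String decodeHexBytes(String hex) {
  StringBuilder sb = new StringBuilder();
  for (String b : hex.trim().split("\\s+")) {
    sb.append((char) Integer.parseInt(b, 16));
  }
  return sb.toString();
}
{code}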

Stack Trace:
java.lang.AssertionError: expected:[74 65 63 68 6e 6f 6c 6f 67 79] but 
was:[70 6f 6c 69 74 69 63 73]
at 
__randomizedtesting.SeedInfo.seed([778B15F1523F921F:2C98AC14DE37CDFF]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:147)
at 
org.apache.lucene.classification.ClassificationTestBase.checkCorrectClassification(ClassificationTestBase.java:68)
at 
org.apache.lucene.classification.SimpleNaiveBayesClassifierTest.testBasicUsage(SimpleNaiveBayesClassifierTest.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:662)




Build Log:
[...truncated 5665 lines...]
[junit4:junit4] Suite: 

[jira] [Commented] (SOLR-3191) field exclusion from fl

2013-02-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582700#comment-13582700
 ] 

Jan Høydahl commented on SOLR-3191:
---

[~lucacavanna], now that SOLR-2719 is fixed, I think it should be a green light 
for this, if you'd like to attempt a patch. I don't know if the code from the 
[UserFields 
class|https://github.com/apache/lucene-solr/blob/eca4d7b44e43e84add4b37cb9b4dde910f58e7c7/solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java#L1276]
 might be helpful at all.
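
To make the idea concrete, a toy sketch of the exclusion semantics; the "-" syntax and the trailing-wildcard handling are assumptions about how a patch might behave, not existing Solr behavior:

{code:java}
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy resolution of an fl value with exclusions, e.g. fl=*,-secret,-internal_*
public class FlExclusionSketch {
  public static Set<String> resolve(Collection<String> storedFields, String fl) {
    Set<String> out = new LinkedHashSet<String>();
    List<String> excludes = new ArrayList<String>();
    for (String tok : fl.split(",")) {
      tok = tok.trim();
      if (tok.startsWith("-")) {
        excludes.add(tok.substring(1));
      } else if (tok.equals("*")) {
        out.addAll(storedFields);
      } else {
        out.add(tok);
      }
    }
    for (String ex : excludes) {
      if (ex.endsWith("*")) {  // trailing-wildcard exclusion, e.g. -internal_*
        String prefix = ex.substring(0, ex.length() - 1);
        for (Iterator<String> it = out.iterator(); it.hasNext();) {
          if (it.next().startsWith(prefix)) {
            it.remove();
          }
        }
      } else {
        out.remove(ex);
      }
    }
    return out;
  }
}
{code}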

 field exclusion from fl
 ---

 Key: SOLR-3191
 URL: https://issues.apache.org/jira/browse/SOLR-3191
 Project: Solr
  Issue Type: Improvement
Reporter: Luca Cavanna
Priority: Minor

 I think it would be useful to add a way to exclude field from the Solr 
 response. If I have for example 100 stored fields and I want to return all of 
 them but one, it would be handy to list just the field I want to exclude 
 instead of the 99 fields for inclusion through fl.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4450) Developer Curb Appeal: Need consistent command line arguments for all nodes

2013-02-20 Thread Mark Bennett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582710#comment-13582710
 ] 

Mark Bennett commented on SOLR-4450:


The idea I was thinking of was that we'd come up in multicast by default, BUT 
also with a named config.

So I could start up 4 instances with -configName MarksLab

Then you start yours up with -configName ShawnsLab

And even though we're using multicast on the same network segment, we don't 
accidentally collide with each other.

 Developer Curb Appeal: Need consistent command line arguments for all nodes
 ---

 Key: SOLR-4450
 URL: https://issues.apache.org/jira/browse/SOLR-4450
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Mark Bennett
 Fix For: 4.2


 Suppose you want to create a small 4 node cluster (2x2, two shards, each 
 replicated), each on its own machine.
 It'd be nice to use the same script in /etc/init.d to start them all, but 
 it's hard to come up with a set of arguments that works for both the first 
 and subsequent nodes.
 When MANUALLY starting them, the arguments for the first node are different 
 than for subsequent nodes:
 Node A like this:
 -DzkRun -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig -jar start.jar
 Vs. the other 3 nodes, B, C, D:
   -DzkHost=nodeA:9983 -jar start.jar
 But if you combine them, you either still have to rely on Node A being up 
 first, and have all nodes reference it:
 -DzkRun -DzkHost=nodeA:9983 -DnumShards=2 
 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=MyConfig
 OR you can try to specify the address of all 4 machines, in all 4 startup 
 scripts, which seems logical but doesn't work:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 This gives an error:
 org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.IllegalArgumentException: port out of range:-1
 This thread suggests a possible change in syntax, but doesn't seem to work 
 (at least with the embedded ZooKeeper)
 Thread:
 http://lucene.472066.n3.nabble.com/solr4-0-problem-zkHost-with-multiple-hosts-throws-out-of-range-exception-td4014440.html
 Syntax:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983/solrroot 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 Error:
 SEVERE: Could not start Solr. Check solr/home property and the logs
 Feb 12, 2013 1:36:49 PM org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.NumberFormatException: For input string: 
 9983/solrroot
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 So:
 * There needs to be some syntax that all nodes can run, even if it requires 
 listing addresses  (or multicast!)
 * And then clear documentation about suggesting external ZooKeeper to be used 
 for production (list being maintained in SOLR-)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-02-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582714#comment-13582714
 ] 

Jan Høydahl commented on SOLR-4470:
---

{quote}
1) Make Solr URL structure right - e.g. /solr/update/collection1
2) Make obvious security constraints like protecting update or protecting 
search etc. impossible to be done by web.xml configuration, and leave it up to 
programmatic protection
{quote}
I think 1) is a non-starter, because someone may have another use case, namely 
assigning collections to different customers, and thus collection is more 
important than action. It all boils down to this: you should trust those that 
you authenticate to access your server enough for them not to muck around 
deleting indices or something. If you need a crazily detailed authorization 
scheme, then put a programmable proxy in front of Solr, like Varnish or 
something!

This issue should be about BASIC auth and perhaps certificate-based auth, with 
the intention of blocking out people or machines that should not have access to 
search at all, versus those that should. It would then be a completely 
different beast of a JIRA to add detailed authorization support.
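
On the client side, SolrJ can already carry BASIC credentials by handing HttpSolrServer a pre-configured HttpClient; a minimal 4.x-era sketch (host, port, and credentials are placeholders). Note this only covers external clients; the hard part this issue targets is forwarding such credentials on Solr's internal node-to-node requests:

{code:java}
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class BasicAuthSolrClient {
  public static HttpSolrServer create() {
    DefaultHttpClient httpClient = new DefaultHttpClient();
    // Register BASIC credentials for the target Solr node (placeholders).
    httpClient.getCredentialsProvider().setCredentials(
        new AuthScope("localhost", 8983),
        new UsernamePasswordCredentials("solr", "secret"));
    return new HttpSolrServer("http://localhost:8983/solr/collection1", httpClient);
  }
}
{code}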

 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: Bug
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
  Labels: authentication, solrclient, solrcloud
 Fix For: 4.2


 We want to protect any HTTP resource (URL). We want to require credentials no 
 matter what kind of HTTP request you make to a Solr node.
 It can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr nodes 
 also make internal requests to other Solr nodes, and for those to work 
 credentials need to be provided here also.
 Ideally we would like to forward credentials from a particular request to 
 all the internal sub-requests it triggers, e.g. for search and update 
 requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered by outside requests (e.g. 
 shard creation/deletion/etc. based on calls to the Collection API)
 * that do not in any way relate to an outside super-request (e.g. 
 replica synching stuff)
 We would like to aim at a solution where the original credentials are 
 forwarded when a request directly/synchronously triggers a sub-request, with a 
 fallback to configured internal credentials for the 
 asynchronous/non-rooted requests.
 In our solution we would aim at only supporting basic HTTP auth, but we would 
 like to build a framework around it, so that not too much refactoring is 
 needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution but created this JIRA issue early in order to get 
 input/comments from the community as early as possible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4432) Developer Curb Appeal: Eliminate the need to run Solr example once in order to unpack needed files

2013-02-20 Thread Mark Bennett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582717#comment-13582717
 ] 

Mark Bennett commented on SOLR-4432:


Hi Mark,

Although I agree with your comment, it's yet another extra manual step to get 
wrong, and it has to be done consistently on all 4 machines.

If this were the only issue, maybe it would be minor, but all those stupid little 
commands to remember add up, especially when you're new.  Solr has a lot of 
those fiddly little things that more modern engines take care of automatically.

And if we know we need it, then why not just do it automatically?

 Developer Curb Appeal: Eliminate the need to run Solr example once in order 
 to unpack needed files
 --

 Key: SOLR-4432
 URL: https://issues.apache.org/jira/browse/SOLR-4432
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Mark Bennett
 Fix For: 4.2


 In the SolrCloud instructions it says you must run Solr from the example 
 directory at least once in order to unpack some files, so that the example 
 directory can then be used as a template for shards.
 Ideally we would unpack whatever we need, or do this automatically.
 Doc reference:
 http://lucidworks.lucidimagination.com/display/solr/Getting+Started+with+SolrCloud
 See the red box that says:
 Make sure to run Solr from the example directory in non-SolrCloud mode at 
 least once before beginning; this process unpacks the jar files necessary to 
 run SolrCloud. On the other hand, make sure also that there are no documents 
 in the example directory before making copies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4434) Developer Curb Appeal: Better options than the manual copy step, and doc changes

2013-02-20 Thread Mark Bennett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582718#comment-13582718
 ] 

Mark Bennett commented on SOLR-4434:


Understood, exampleN vs. shardN, but we're still using some ordinal set of 
directories.  That really only makes sense if you're trying to run multiple 
nodes on a single laptop.

I don't fully understand the distribution of labor between the wiki and Lucid's 
search hub.  Not sure who keeps them in sync.

 Developer Curb Appeal: Better options than the manual copy step, and doc 
 changes
 

 Key: SOLR-4434
 URL: https://issues.apache.org/jira/browse/SOLR-4434
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Mark Bennett
 Fix For: 4.2


 We make developers manually copy the example directory to a named shard 
 directory.
 Doc references:
 http://lucidworks.lucidimagination.com/display/solr/Getting+Started+with+SolrCloud
 http://wiki.apache.org/solr/SolrCloud
 Sample commands:
 cp -r example shard1
 cp -r example shard2
 The doc is perhaps geared towards a developer laptop, so in that case you 
 really would need to make sure they have different names.
 But if you're running on a more realistic multi-node system, let's say 4 
 nodes handling 2 shards, the actual shard allocation (shard1 vs. shard2) 
 will be fixed by the order each node is started in FOR THE FIRST TIME.
 At a minimum, we should do a better job of explaining the somewhat arbitrary 
 nature of the destination directories, and that the start order is what 
 really matters.
 We should also document that the actual shard assignment will not change, 
 regardless of the name, and document where this information is persisted.
 Could we have an intelligent guess as to what template directory to use, and 
 do the copy when the node is first started.
 It's apparently also possible to startup the first Solr node with no cores 
 and just point it at a template.  This would be good to document.  There's 
 currently a bug in the Web UI if you do this, but I'll be logging another 
 JIRA for that.
 When combined with all the other little details of bringing up SolrCloud 
 nodes, this is confusing to a newcomer and mildly annoying.  Other engines 
 don't require this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-4480:
--

Priority: Critical  (was: Minor)

 EDisMax parser blows up with query containing single plus or minus
 --

 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Critical
 Fix For: 4.2


 We are running Solr with Sunspot, and when we submit a query containing a 
 single plus, Solr blows up with the following error:
 SOLR Request (5.0ms)  [ path=#<RSolr::Client:0x4c7464ac> parameters={data: 
 fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&q=%2B&fl=%2A+score&qf=name_texts+first_name_texts+last_name_texts+file_name_texts&defType=edismax&hl=on&hl.simple.pre=%40%40%40hl%40%40%40&hl.simple.post=%40%40%40endhl%40%40%40&start=0&rows=3,
  method: post, params: {:wt=>:ruby}, query: wt=ruby, headers: 
 {Content-Type=>application/x-www-form-urlencoded; charset=UTF-8}, path: 
 select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
 read_timeout: } ]
 RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
 Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
 Encountered "<EOF>" at line 1, column 0.
 Was expecting one of:
     <NOT> ...
     "+" ...
     "-" ...
     "(" ...
     "*" ...
     <QUOTED> ...
     <TERM> ...
     <PREFIXTERM> ...
     <WILDTERM> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-4480:
--

Fix Version/s: 4.2

 EDisMax parser blows up with query containing single plus or minus
 --

 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Minor
 Fix For: 4.2


 We are running Solr with Sunspot, and when we submit a query containing a 
 single plus, Solr blows up with the following error:
 SOLR Request (5.0ms)  [ path=#<RSolr::Client:0x4c7464ac> parameters={data: 
 fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&q=%2B&fl=%2A+score&qf=name_texts+first_name_texts+last_name_texts+file_name_texts&defType=edismax&hl=on&hl.simple.pre=%40%40%40hl%40%40%40&hl.simple.post=%40%40%40endhl%40%40%40&start=0&rows=3,
  method: post, params: {:wt=>:ruby}, query: wt=ruby, headers: 
 {Content-Type=>application/x-www-form-urlencoded; charset=UTF-8}, path: 
 select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
 read_timeout: } ]
 RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
 Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
 Encountered "<EOF>" at line 1, column 0.
 Was expecting one of:
     <NOT> ...
     "+" ...
     "-" ...
     "(" ...
     "*" ...
     <QUOTED> ...
     <TERM> ...
     <PREFIXTERM> ...
     <WILDTERM> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582725#comment-13582725
 ] 

Jan Høydahl commented on SOLR-4480:
---

Thanks for reporting this. As EDismax is all about being robust and never 
crashing, this must be fixed.
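
Until a proper fix lands, here is a defensive sketch of the kind of pre-parse guard a client (or the parser itself) could apply; the heuristic is mine, not the eventual fix:

{code:java}
// Heuristic pre-parse guard, a sketch only: treat a raw query consisting
// solely of operators/whitespace, such as "+" or "-", as effectively empty
// instead of handing it to the parser.
public class QuerySanity {
  public static boolean isEffectivelyEmpty(String q) {
    if (q == null) {
      return true;
    }
    // Strip whitespace and lone +/- operators; anything left is real input.
    return q.replaceAll("[\\s+\\-]", "").isEmpty();
  }
}
{code}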

 EDisMax parser blows up with query containing single plus or minus
 --

 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Critical
 Fix For: 4.2


 We are running Solr with Sunspot, and when we submit a query containing a 
 single plus, Solr blows up with the following error:
 SOLR Request (5.0ms)  [ path=#<RSolr::Client:0x4c7464ac> parameters={data: 
 fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&q=%2B&fl=%2A+score&qf=name_texts+first_name_texts+last_name_texts+file_name_texts&defType=edismax&hl=on&hl.simple.pre=%40%40%40hl%40%40%40&hl.simple.post=%40%40%40endhl%40%40%40&start=0&rows=3,
  method: post, params: {:wt=>:ruby}, query: wt=ruby, headers: 
 {Content-Type=>application/x-www-form-urlencoded; charset=UTF-8}, path: 
 select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
 read_timeout: } ]
 RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
 Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
 Encountered "<EOF>" at line 1, column 0.
 Was expecting one of:
     <NOT> ...
     "+" ...
     "-" ...
     "(" ...
     "*" ...
     <QUOTED> ...
     <TERM> ...
     <PREFIXTERM> ...
     <WILDTERM> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3633) web UI reports an error if CoreAdminHandler says there are no SolrCores

2013-02-20 Thread Mark Bennett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582729#comment-13582729
 ] 

Mark Bennett commented on SOLR-3633:


Hi Mark, thanks for the patch.

I see this:
+   // :TODO: Add Core Button

Any thoughts on that?  To me this seems like the most important part of the 
issue.

 web UI reports an error if CoreAdminHandler says there are no SolrCores
 ---

 Key: SOLR-3633
 URL: https://issues.apache.org/jira/browse/SOLR-3633
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0-ALPHA
Reporter: Hoss Man
Assignee: Stefan Matheis (steffkes)
 Fix For: 4.2

 Attachments: SOLR-3633.patch, SOLR-3633.patch


 Spun off from SOLR-3591...
 * having no SolrCores is a valid situation
 * independent of what may happen in SOLR-3591, the web UI should cleanly deal 
 with there being no SolrCores, and just hide/grey out any tabs that can't be 
 supported w/o at least one core
 * even if there are no SolrCores the core admin features (ie: creating a new 
 core) should be accessible in the UI

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4477) match-only query support (terms,wildcards,ranges) for docvalues fields.

2013-02-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved SOLR-4477.
---

   Resolution: Fixed
Fix Version/s: 5.0
   4.2

 match-only query support (terms,wildcards,ranges) for docvalues fields.
 ---

 Key: SOLR-4477
 URL: https://issues.apache.org/jira/browse/SOLR-4477
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2
Reporter: Robert Muir
 Fix For: 4.2, 5.0

 Attachments: SOLR-4477.patch


 Historically, you had to invert fields (indexed=true) to do any queries 
 against them.
 But now it's possible to build a forward index for the field (docValues=true).
 I think in many cases (e.g. a string field you only sort and match on), it's 
 unnecessary and wasteful
 to force the user to also invert if they don't need scoring.
 So I think Solr should support match-only semantics in this case for 
 term, wildcard, range, etc.
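
The building blocks for this already exist in Lucene 4.x; a hedged sketch of match-only term filtering that consults the forward index instead of an inverted one, assuming a single-valued FieldCache/docValues-backed string field:

{code:java}
import org.apache.lucene.search.ConstantScoreQuery;
import org.apache.lucene.search.FieldCacheTermsFilter;
import org.apache.lucene.search.Query;

public class MatchOnlySketch {
  // Match-only "terms" query: FieldCacheTermsFilter consults the forward
  // index (FieldCache / docValues), so the field need not be inverted, and
  // ConstantScoreQuery gives every match the same score.
  public static Query termsQuery(String field, String... values) {
    return new ConstantScoreQuery(new FieldCacheTermsFilter(field, values));
  }
}
{code}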

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3633) web UI reports an error if CoreAdminHandler says there are no SolrCores

2013-02-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582757#comment-13582757
 ] 

Mark Miller commented on SOLR-3633:
---

I just see it as one of a few pieces, but I only updated the existing patch, 
which is essentially just what hossman describes above - I can tweak the UI 
around, but I don't have any immediate plans to develop any features. Hopefully 
the guys who have been pushing the UI forward will lend a hand for further 
work in this area.

 web UI reports an error if CoreAdminHandler says there are no SolrCores
 ---

 Key: SOLR-3633
 URL: https://issues.apache.org/jira/browse/SOLR-3633
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0-ALPHA
Reporter: Hoss Man
Assignee: Stefan Matheis (steffkes)
 Fix For: 4.2

 Attachments: SOLR-3633.patch, SOLR-3633.patch


 Spun off from SOLR-3591...
 * having no SolrCores is a valid situation
 * independent of what may happen in SOLR-3591, the web UI should cleanly deal 
 with there being no SolrCores, and just hide/grey out any tabs that can't be 
 supported w/o at least one core
 * even if there are no SolrCores the core admin features (ie: creating a new 
 core) should be accessible in the UI

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4450) Developer Curb Appeal: Need consistent command line arguments for all nodes

2013-02-20 Thread Paul Doscher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582761#comment-13582761
 ] 

Paul Doscher commented on SOLR-4450:


So what you are saying is you want to copy ElasticSearch?

 Developer Curb Appeal: Need consistent command line arguments for all nodes
 ---

 Key: SOLR-4450
 URL: https://issues.apache.org/jira/browse/SOLR-4450
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Mark Bennett
 Fix For: 4.2


 Suppose you want to create a small 4 node cluster (2x2, two shards, each 
 replicated), each on its own machine.
 It'd be nice to use the same script in /etc/init.d to start them all, but 
 it's hard to come up with a set of arguments that works for both the first 
 and subsequent nodes.
 When MANUALLY starting them, the arguments for the first node are different 
 than for subsequent nodes:
 Node A like this:
 -DzkRun -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig -jar start.jar
 Vs. the other 3 nodes, B, C, D:
   -DzkHost=nodeA:9983 -jar start.jar
 But if you combine them, you either still have to rely on Node A being up 
 first, and have all nodes reference it:
 -DzkRun -DzkHost=nodeA:9983 -DnumShards=2 
 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=MyConfig
 OR you can try to specify the address of all 4 machines, in all 4 startup 
 scripts, which seems logical but doesn't work:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 This gives an error:
 org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.IllegalArgumentException: port out of range:-1
 This thread suggests a possible change in syntax, but doesn't seem to work 
 (at least with the embedded ZooKeeper)
 Thread:
 http://lucene.472066.n3.nabble.com/solr4-0-problem-zkHost-with-multiple-hosts-throws-out-of-range-exception-td4014440.html
 Syntax:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983/solrroot 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 Error:
 SEVERE: Could not start Solr. Check solr/home property and the logs
 Feb 12, 2013 1:36:49 PM org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.NumberFormatException: For input string: 
 9983/solrroot
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 So:
 * There needs to be some syntax that all nodes can run, even if it requires 
 listing addresses  (or multicast!)
 * And then clear documentation about suggesting external ZooKeeper to be used 
 for production (list being maintained in SOLR-)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4450) Developer Curb Appeal: Need consistent command line arguments for all nodes

2013-02-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582769#comment-13582769
 ] 

Shawn Heisey commented on SOLR-4450:


The following paragraph does not address the initial idea in this issue of 
allowing startup re-bootstrapping of config steps when using an init script.  
It only addresses the fact that configName would (IMHO) be a bad option name 
for differentiating multicast.

I'm going to have several config sets stored in zookeeper and even more 
collections that use those config sets, so using something called configName 
for a multicast identifier is *very* confusing.  If multicasting is added to 
Solr, a better name for that option would be mcastName or multicastName.  It 
would be even better to also allow configuring the multicast address and UDP 
port number.


 Developer Curb Appeal: Need consistent command line arguments for all nodes
 ---

 Key: SOLR-4450
 URL: https://issues.apache.org/jira/browse/SOLR-4450
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Mark Bennett
 Fix For: 4.2


 Suppose you want to create a small 4 node cluster (2x2, two shards, each 
 replicated), each on its own machine.
 It'd be nice to use the same script in /etc/init.d to start them all, but 
 it's hard to come up with a set of arguments that works for both the first 
 and subsequent nodes.
 When MANUALLY starting them, the arguments for the first node are different 
 than for subsequent nodes:
 Node A like this:
 -DzkRun -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig -jar start.jar
 Vs. the other 3 nodes, B, C, D:
   -DzkHost=nodeA:9983 -jar start.jar
 But if you combine them, you either still have to rely on Node A being up 
 first, and have all nodes reference it:
 -DzkRun -DzkHost=nodeA:9983 -DnumShards=2 
 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=MyConfig
 OR you can try to specify the address of all 4 machines, in all 4 startup 
 scripts, which seems logical but doesn't work:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 This gives an error:
 org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.IllegalArgumentException: port out of range:-1
 This thread suggests a possible change in syntax, but doesn't seem to work 
 (at least with the embedded ZooKeeper)
 Thread:
 http://lucene.472066.n3.nabble.com/solr4-0-problem-zkHost-with-multiple-hosts-throws-out-of-range-exception-td4014440.html
 Syntax:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983/solrroot 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 Error:
 SEVERE: Could not start Solr. Check solr/home property and the logs
 Feb 12, 2013 1:36:49 PM org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.NumberFormatException: For input string: 
 9983/solrroot
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 So:
 * There needs to be some syntax that all nodes can run, even if it requires 
 listing addresses  (or multicast!)
 * And then clear documentation about suggesting external ZooKeeper to be used 
 for production (list being maintained in SOLR-)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4450) Developer Curb Appeal: Need consistent command line arguments for all nodes

2013-02-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582769#comment-13582769
 ] 

Shawn Heisey edited comment on SOLR-4450 at 2/21/13 1:30 AM:
-

The following paragraph does not address the initial idea in this issue of 
allowing startup re-bootstrapping of config sets when using an init script.  It 
only addresses the fact that configName would (IMHO) be a bad option name for 
differentiating multicast.

I'm going to have several config sets stored in zookeeper and even more 
collections that use those config sets, so using something called configName 
for a multicast identifier is *very* confusing.  If multicasting is added to 
Solr, a better name for that option would be mcastName or multicastName.  It 
would be even better to also allow configuring the multicast address and UDP 
port number.


  was (Author: elyograg):
The following paragraph does not address the initial idea in this issue of 
allowing startup re-bootstrapping of config steps when using an init script.  
It only addresses the fact that configName would (IMHO) be a bad option name 
for differentiating multicast.

I'm going to have several config sets stored in zookeeper and even more 
collections that use those config sets, so using something called configName 
for a multicast identifier is *very* confusing.  If multicasting is added to 
Solr, a better name for that option would be mcastName or multicastName.  It 
would be even better to also allow configuring the multicast address and UDP 
port number.

  
 Developer Curb Appeal: Need consistent command line arguments for all nodes
 ---

 Key: SOLR-4450
 URL: https://issues.apache.org/jira/browse/SOLR-4450
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Mark Bennett
 Fix For: 4.2


 Suppose you want to create a small 4 node cluster (2x2, two shards, each 
 replicated), each on its own machine.
 It'd be nice to use the same script in /etc/init.d to start them all, but 
 it's hard to come up with a set of arguments that works for both the first 
 and subsequent nodes.
 When MANUALLY starting them, the arguments for the first node are different 
 than for subsequent nodes:
 Node A like this:
 -DzkRun -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig -jar start.jar
 Vs. the other 3 nodes, B, C, D:
   -DzkHost=nodeA:9983 -jar start.jar
 But if you combine them, you either still have to rely on Node A being up 
 first, and have all nodes reference it:
 -DzkRun -DzkHost=nodeA:9983 -DnumShards=2 
 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=MyConfig
 OR you can try to specify the address of all 4 machines, in all 4 startup 
 scripts, which seems logical but doesn't work:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 This gives an error:
 org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.IllegalArgumentException: port out of range:-1
 This thread suggests a possible change in syntax, but it doesn't seem to work 
 (at least with the embedded ZooKeeper)
 Thread:
 http://lucene.472066.n3.nabble.com/solr4-0-problem-zkHost-with-multiple-hosts-throws-out-of-range-exception-td4014440.html
 Syntax:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983/solrroot 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 Error:
 SEVERE: Could not start Solr. Check solr/home property and the logs
 Feb 12, 2013 1:36:49 PM org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.NumberFormatException: For input string: 
 9983/solrroot
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 So:
 * There needs to be some syntax that all nodes can run, even if it requires 
 listing addresses  (or multicast!)
 * And then clear documentation about suggesting external ZooKeeper to be used 
 for production (list being maintained in SOLR-)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4450) Developer Curb Appeal: Need consistent command line arguments for all nodes

2013-02-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582771#comment-13582771
 ] 

Mark Miller commented on SOLR-4450:
---

bq. which seems logical but doesn't work

It doesn't work because you are using zkRun incorrectly, probably because it 
is not documented well on the wiki.
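
For reference, a minimal sketch of the pattern the SolrCloud wiki documents 
for an embedded-ZooKeeper ensemble (all nodes on one machine, ports 
illustrative): every node passes -DzkRun plus the same zkHost list, and each 
embedded ZooKeeper listens on its node's jetty.port + 1000, so the zkHost 
entries use those ZooKeeper ports rather than the Solr ports:

  # node 1: Solr on 8983, embedded ZK on 9983
  java -DzkRun -DnumShards=2 \
       -Dbootstrap_confdir=./solr/collection1/conf \
       -Dcollection.configName=MyConfig \
       -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
  # node 2: Solr on 7574, embedded ZK on 8574
  java -Djetty.port=7574 -DzkRun \
       -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
  # node 3: Solr on 8900, embedded ZK on 9900
  java -Djetty.port=8900 -DzkRun \
       -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar

Whether the same arguments work unchanged across separate machines is exactly 
what this issue questions; the sketch only shows the documented 
single-machine form.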

 Developer Curb Appeal: Need consistent command line arguments for all nodes
 ---

 Key: SOLR-4450
 URL: https://issues.apache.org/jira/browse/SOLR-4450
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Mark Bennett
 Fix For: 4.2


 Suppose you want to create a small 4 node cluster (2x2, two shards, each 
 replicated), each on its own machine.
 It'd be nice to use the same script in /etc/init.d to start them all, but 
 it's hard to come up with a set of arguments that works for both the first 
 and subsequent nodes.
 When MANUALLY starting them, the arguments for the first node are different 
 than for subsequent nodes:
 Node A like this:
 -DzkRun -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig -jar start.jar
 Vs. the other 3 nodes, B, C, D:
   -DzkHost=nodeA:9983 -jar start.jar
 But if you combine them, you either still have to rely on Node A being up 
 first, and have all nodes reference it:
 -DzkRun -DzkHost=nodeA:9983 -DnumShards=2 
 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=MyConfig
 OR you can try to specify the address of all 4 machines, in all 4 startup 
 scripts, which seems logical but doesn't work:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 This gives an error:
 org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.IllegalArgumentException: port out of range:-1
 This thread suggests a possible change in syntax, but it doesn't seem to work 
 (at least with the embedded ZooKeeper)
 Thread:
 http://lucene.472066.n3.nabble.com/solr4-0-problem-zkHost-with-multiple-hosts-throws-out-of-range-exception-td4014440.html
 Syntax:
 -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983/solrroot 
 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf 
 -Dcollection.configName=MyConfig
 Error:
 SEVERE: Could not start Solr. Check solr/home property and the logs
 Feb 12, 2013 1:36:49 PM org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.NumberFormatException: For input string: 
 9983/solrroot
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 So:
 * There needs to be some syntax that all nodes can run, even if it requires 
 listing addresses  (or multicast!)
 * And then clear documentation about suggesting external ZooKeeper to be used 
 for production (list being maintained in SOLR-)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4481) SwitchQParserPlugin

2013-02-20 Thread Hoss Man (JIRA)
Hoss Man created SOLR-4481:
--

 Summary: SwitchQParserPlugin
 Key: SOLR-4481
 URL: https://issues.apache.org/jira/browse/SOLR-4481
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man


Inspired by a conversation I had with someone on IRC a while back about using 
append fq params + local params to create custom request params, it occurred 
to me that it would be handy to have a switch qparser that could be 
configured with some set of fixed switch case localparams that it would 
delegate to based on its input string.
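
A sketch of what that might look like in practice, assuming a parser 
registered as "switch" with per-case localparams; the localparam names below 
(case.*, default) are illustrative, not a committed syntax. An appended fq in 
solrconfig.xml would delegate to a fixed filter based on a custom request 
param:

  <!-- 'case.*' and 'default' are illustrative localparam names -->
  <lst name="appends">
    <str name="fq">{!switch case.yes='inStock:true'
                            case.no='inStock:false'
                            default='*:*' v=$in_stock}</str>
  </lst>

A request carrying in_stock=yes would then pick up fq=inStock:true, while any 
other value falls through to the default.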

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


