[jira] [Commented] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.

2013-07-30 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723503#comment-13723503
 ] 

Mikhail Khludnev commented on SOLR-5088:


Hello,

Can you tell us how you run that code? I'm concerned by the Java thread pool at 
the bottom of the stack trace. 
I'd suggest you set up an exception breakpoint and evaluate the classloaders 
for your class and its ancestors. Pay attention to the parent and url fields. 
Post your observations here.
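For reference, a minimal sketch of walking a classloader chain, which is what the breakpoint evaluation above amounts to (the class name here is only a placeholder; in the real session you would evaluate the handler class itself):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: dump the classloader parent chain of a class, bottom to top.
// Substitute your CustomSearchHandler class when debugging for real.
public class LoaderChainProbe {

    // Collect each loader in the parent chain; the bootstrap loader is
    // represented by null, so we record it explicitly at the end.
    static List<String> chain(Class<?> c) {
        List<String> out = new ArrayList<>();
        for (ClassLoader cl = c.getClassLoader(); cl != null; cl = cl.getParent()) {
            out.add(cl.toString());
        }
        out.add("<bootstrap>");
        return out;
    }

    public static void main(String[] args) {
        // At the exception breakpoint this would be
        // Class.forName("org.my.solr.index.CustomSearchHandler") instead.
        for (String loader : chain(LoaderChainProbe.class)) {
            System.out.println(loader);
        }
    }
}
```

If the handler class and SolrRequestHandler print different chains, that mismatch is exactly what makes asSubclass fail.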

 ClassCastException is thrown when trying to use custom SearchHandler.
 -

 Key: SOLR-5088
 URL: https://issues.apache.org/jira/browse/SOLR-5088
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Pavel Yaskevich

 Hi guys,
   I'm trying to replace solr.SearchHandler with a custom one in solrconfig.xml 
 for one of the stores, and it throws the following exception: 
 {noformat}
 Caused by: org.apache.solr.common.SolrException: RequestHandler init failure
   at 
 org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:167)
   at org.apache.solr.core.SolrCore.init(SolrCore.java:772)
   ... 13 more
 Caused by: org.apache.solr.common.SolrException: Error Instantiating Request 
 Handler, org.my.solr.index.CustomSearchHandler failed to instantiate 
 org.apache.solr.request.SolrRequestHandler
   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551)
   at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:603)
   at 
 org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:153)
   ... 14 more
 Caused by: java.lang.ClassCastException: class 
 org.my.solr.index.CustomSearchHandler
   at java.lang.Class.asSubclass(Class.java:3116)
   at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
   at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381)
   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:530)
   ... 16 more
 {noformat}
 I actually tried extending SearchHandler, implementing SolrRequestHandler, 
 and extending RequestHandlerBase, and it's all the same ClassCastException 
 result...
 org.my.solr.index.CustomSearchHandler is definitely on the classpath and is 
 recompiled on every retry. 
 Maybe I'm doing something terribly wrong?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.

2013-07-30 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723521#comment-13723521
 ] 

Pavel Yaskevich commented on SOLR-5088:
---

It's Solr+Jetty; I also tried using EmbeddedSolrServer with the same result. I 
thought about the classloader not picking up the class myself, but if I give 
it a wrong class name it throws the expected ClassNotFoundException from 
Class.forName, whereas with the existing class it actually fails in the 
asSubclass method with a ClassCastException, which means that the custom class 
was found. 

Looking at the code at SolrResourceLoader.java:443, it has tracing in place; I 
will try to run with the log level set to TRACE tomorrow and post the results 
here. Maybe that will give more clarity...
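The distinction drawn here can be reproduced in isolation with JDK classes only (a sketch; the two-classloader failure mode in Solr surfaces through the same asSubclass call):

```java
import java.util.List;
import java.util.Map;

// Sketch of the two failure modes being distinguished:
// - Class.forName on a missing name  -> ClassNotFoundException (not found)
// - asSubclass on a found class that is not assignable to the target
//   -> ClassCastException (found, but hierarchy mismatch)
public class AsSubclassDemo {

    static Class<? extends List> asList(String name) throws ClassNotFoundException {
        return Class.forName(name).asSubclass(List.class);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(asList("java.util.ArrayList")); // found and castable

        try {
            Class.forName("java.util.ArrayList").asSubclass(Map.class);
        } catch (ClassCastException e) {
            // Same shape as the SOLR-5088 trace: the class loaded fine, but
            // asSubclass rejected it. With plugins this typically means the
            // interface was loaded twice by different classloaders.
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```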




[jira] [Commented] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.

2013-07-30 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723532#comment-13723532
 ] 

Pavel Yaskevich commented on SOLR-5088:
---

I couldn't wait until tomorrow, so I ran everything with TRACE. This is the 
line from the log that confirms the custom class is loaded (among 100+ other 
loaded classes):

{noformat}
DEBUG 00:46:25,705 loaded class org.my.solr.index.CustomQueryComponent from 
sun.misc.Launcher$AppClassLoader@6fb9658e
{noformat}





[jira] [Created] (SOLR-5089) OverseerCollectionProcessorTest does not fail on assertions thrown by mock objects

2013-07-30 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-5089:
---

 Summary: OverseerCollectionProcessorTest does not fail on 
assertions thrown by mock objects
 Key: SOLR-5089
 URL: https://issues.apache.org/jira/browse/SOLR-5089
 Project: Solr
  Issue Type: Bug
  Components: Tests
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 4.5


The OverseerCollectionProcessorTest uses EasyMock for testing, but the test 
does not fail if the mock object throws assertions because of unexpected 
method calls.

For example, I modified the Overseer to NOT throw an exception if 
maxShardsAllowedToCreate < requestedShardsToCreate. The mock object logs an 
exception with an AssertionError, but the test still passes.

{code}
   [junit4]   2> 1158 T11 oas.SolrTestCaseJ4.setUp ###Starting testNoReplicationCollectionNotCreatedDueToMaxShardsPerNodeAndNodesToCreateOnLimits
   [junit4]   2> 1195 T12 oasc.OverseerCollectionProcessor.run Process current queue of collection creations
   [junit4]   2> 2215 T12 oasc.OverseerCollectionProcessor.run Overseer Collection Processor: Get the message id:id message:{
   [junit4]   2>   replicationFactor:1,
   [junit4]   2>   operation:createcollection,
   [junit4]   2>   numShards:8,
   [junit4]   2>   maxShardsPerNode:2,
   [junit4]   2>   collection.configName:myconfig,
   [junit4]   2>   createNodeSet:localhost:8964_solr,localhost:8966_solr,localhost:8963_solr,
   [junit4]   2>   name:mycollection}
   [junit4]   2> 2216 T12 oasc.OverseerCollectionProcessor.createCollection Creating shard mycollection_shard1_replica1 as part of slice shard1 of collection mycollection on localhost:8964_solr
   [junit4]   2> 2242 T12 oasc.SolrException.log ERROR Collection createcollection of createcollection failed:java.lang.AssertionError:
   [junit4]   2>   Unexpected method call submit(ShardRequest:{params=action=CREATE&name=mycollection_shard1_replica1&collection.configName=myconfig&collection=mycollection&shard=shard1&numShards=8&qt=%2Fadmin%2Fcores, purpose=1, nResponses =0}, localhost:8964/solr, action=CREATE&name=mycollection_shard1_replica1&collection.configName=myconfig&collection=mycollection&shard=shard1&numShards=8&qt=%2Fadmin%2Fcores):
   [junit4]   2> at org.easymock.internal.MockInvocationHandler.invoke(MockInvocationHandler.java:45)
   [junit4]   2> at org.easymock.internal.ObjectMethodsFilter.invoke(ObjectMethodsFilter.java:73)
   [junit4]   2> at org.easymock.internal.ClassProxyFactory$MockMethodInterceptor.intercept(ClassProxyFactory.java:69)
   [junit4]   2> at org.apache.solr.handler.component.ShardHandler$$EnhancerByCGLIB$$27b6a726.submit(<generated>)
   [junit4]   2> at org.apache.solr.cloud.OverseerCollectionProcessor.createCollection(OverseerCollectionProcessor.java:838)
   [junit4]   2> at org.apache.solr.cloud.OverseerCollectionProcessor.processMessage(OverseerCollectionProcessor.java:175)
   [junit4]   2> at org.apache.solr.cloud.OverseerCollectionProcessorTest$OverseerCollectionProcessorToBeTested.processMessage(OverseerCollectionProcessorTest.java:95)
   [junit4]   2> at org.apache.solr.cloud.OverseerCollectionProcessor.run(OverseerCollectionProcessor.java:127)
   [junit4]   2> at java.lang.Thread.run(Thread.java:724)
   [junit4]   2>
   [junit4]   2> 2259 T12 oasc.OverseerCollectionProcessor.run Overseer Collection Processor: Message id:id complete, response:{Operation createcollection caused exception:=java.lang.AssertionError:
   [junit4]   2>   Unexpected method call submit(ShardRequest:{params=action=CREATE&name=mycollection_shard1_replica1&collection.configName=myconfig&collection=mycollection&shard=shard1&numShards=8&qt=%2Fadmin%2Fcores, purpose=1, nResponses =0}, localhost:8964/solr, action=CREATE&name=mycollection_shard1_replica1&collection.configName=myconfig&collection=mycollection&shard=shard1&numShards=8&qt=%2Fadmin%2Fcores):,exception={msg=
   [junit4]   2>   Unexpected method call submit(ShardRequest:{params=action=CREATE&name=mycollection_shard1_replica1&collection.configName=myconfig&collection=mycollection&shard=shard1&numShards=8&qt=%2Fadmin%2Fcores, purpose=1, nResponses =0}, localhost:8964/solr, action=CREATE&name=mycollection_shard1_replica1&collection.configName=myconfig&collection=mycollection&shard=shard1&numShards=8&qt=%2Fadmin%2Fcores):,rspCode=-1}}
   [junit4]   2> 2307 T11 oas.SolrTestCaseJ4.tearDown ###Ending testNoReplicationCollectionNotCreatedDueToMaxShardsPerNodeAndNodesToCreateOnLimits
   [junit4] OK  1.28s | OverseerCollectionProcessorTest.testNoReplicationCollectionNotCreatedDueToMaxShardsPerNodeAndNodesToCreateOnLimits
{code}
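The mechanics of why the test stays green can be sketched outside EasyMock (names here are hypothetical, not the actual test's; the point is that the Overseer's worker loop catches and logs Throwables, so the mock's AssertionError never reaches JUnit):

```java
// Sketch: an AssertionError thrown inside a worker thread that catches and
// logs Throwable never propagates to the test thread, so the test passes.
public class SwallowedAssertionDemo {

    static volatile Throwable swallowed;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                // EasyMock throws this on an unexpected method call.
                throw new AssertionError("Unexpected method call submit(...)");
            } catch (Throwable t) {
                swallowed = t; // logged and discarded, like the worker loop does
            }
        });
        worker.start();
        worker.join();
        // Without an explicit re-check on the test thread (or EasyMock's
        // verify()), nothing fails here and the test reports OK.
        System.out.println("swallowed: " + swallowed);
    }
}
```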


[jira] [Commented] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.

2013-07-30 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723586#comment-13723586
 ] 

Mikhail Khludnev commented on SOLR-5088:


It's still not clear how you run it. Did you put your classes as a lib in 
Jetty? My only _guess_: your classes are loaded by some lower classloader (and 
consequently that triggers loading of the Solr classes); after that, the 
app/Jetty loads the Solr classes again in a descendant classloader, whether 
Jetty's web-app classloader or SolrResourceLoader's classloader. That breaks 
the classloader chain, which makes asSubclass fail. 
Please make sure that you follow the instructions at 
http://wiki.apache.org/solr/SolrPlugins#How_to_Load_Plugins 
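For reference, the usual way to keep plugin classes inside Solr's own loader is a lib directive in solrconfig.xml (or a jar dropped in the core's lib/ directory) rather than Jetty's classpath; the directory path below is a placeholder:

{code:xml}
<!-- In solrconfig.xml: jars matched here are loaded by SolrResourceLoader,
     so the plugin shares Solr's classloader. "path/to/plugin" is a
     placeholder for your actual plugin directory. -->
<lib dir="path/to/plugin" regex=".*\.jar" />
{code}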




[jira] [Comment Edited] (LUCENE-5145) Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)

2013-07-30 Thread Boaz Leskes (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723590#comment-13723590
 ] 

Boaz Leskes edited comment on LUCENE-5145 at 7/30/13 8:59 AM:
--

Based on Adrien's comments, I changed MultiDocValues.subIndexes, 
SortedDocValuesWriter.pending, and SortedSetDocValuesWriter.pending to use the 
new AppendingPackedLongBuffer (see v2 patch).

  was (Author: bleskes):
Based on Adrien's comments, I changed MultiDocValues.subIndexes, 
SortedDocValuesWriter.pending , SortedSetDocValuesWriter.pending use new 
AppendingPackedLongBuffer
  
 Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family 
 (customizable compression ratio + bulk retrieval)
 ---

 Key: LUCENE-5145
 URL: https://issues.apache.org/jira/browse/LUCENE-5145
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Boaz Leskes
Assignee: Adrien Grand
 Attachments: LUCENE-5145.patch, LUCENE-5145.v2.patch


 Made acceptableOverheadRatio configurable.
 Added bulk get to the AbstractAppendingLongBuffer classes, for faster 
 retrieval.
 Introduced a new variant, AppendingPackedLongBuffer, which solely relies on 
 PackedInts as a back-end. This new class is useful where people have 
 non-negative numbers with a fairly uniform distribution over a fixed 
 (limited) range, e.g. facet ordinals.
 To distinguish it from AppendingPackedLongBuffer, the delta-based 
 AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer.
 Fixed an issue with NullReader where it didn't respect its valueCount in 
 bulk gets.




[jira] [Updated] (LUCENE-5145) Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)

2013-07-30 Thread Boaz Leskes (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boaz Leskes updated LUCENE-5145:


Attachment: LUCENE-5145.v2.patch

Based on Adrien's comments, I changed MultiDocValues.subIndexes, 
SortedDocValuesWriter.pending, and SortedSetDocValuesWriter.pending to use the 
new AppendingPackedLongBuffer.




[jira] [Updated] (SOLR-5079) Create ngroups for pivot faceting

2013-07-30 Thread Sandro Mario Zbinden (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandro Mario Zbinden updated SOLR-5079:
---

Remaining Estimate: 4h  (was: 24h)
 Original Estimate: 4h  (was: 24h)

 Create ngroups for pivot faceting
 -

 Key: SOLR-5079
 URL: https://issues.apache.org/jira/browse/SOLR-5079
 Project: Solr
  Issue Type: Improvement
Affects Versions: 5.0, 4.5
Reporter: Sandro Mario Zbinden
  Labels: facet, pivot
 Attachments: SOLR-5079.patch, SOLR-5079.patch

   Original Estimate: 4h
  Remaining Estimate: 4h

 To save network traffic it would be great to know how many entries a facet 
 list contains without loading the complete facet list. This issue was created 
 because of an out-of-memory error when loading the pivot facet with 
 facet.limit set to -1.
 The facet.pivot result would then look like
 q=facet.pivot=cat,id&*facet.pivot.ngroup=true*
 {code:xml}
 <arr name="cat,id">
  <lst>
    <str name="field">cat</str>
    <str name="value">a</str>
    <int name="count">20</int>
    <arr name="pivot">
      <lst>
        <str name="field">id</str>
        <int name="value">69</int>
        <int name="count">10</int>
      </lst>
      <lst>
        <str name="field">id</str>
        <int name="value">71</int>
        <int name="count">10</int>
      </lst>
    </arr>
    <int name="ngroup">2</int> <!-- The new ngroup param -->
  </lst>
 </arr>
 {code}
 If you add another new param, for example facet.pivot.visible, the result 
 could create less traffic, especially if there are a lot of ids and 
 facet.limit=-1 is set:
 q=facet.pivot=cat,id&*facet.ngroup=true&f.id.facet.pivot.visible=false*
 {code:xml}
 <arr name="cat,id">
  <lst>
    <str name="field">cat</str>
    <str name="value">a</str>
    <int name="count">20</int>
    <!-- No pivot list of id -->
    <int name="ngroup">2</int>
  </lst>
 </arr>
 {code}




[jira] [Updated] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight

2013-07-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-5146:


Attachment: LUCENE-5146.patch

New patch including a CHANGES.txt entry; also removed compiler warnings caused 
by the test helper.

 AnalyzingSuggester sort order doesn't respect the actual weight
 ---

 Key: LUCENE-5146
 URL: https://issues.apache.org/jira/browse/LUCENE-5146
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 4.4
Reporter: Simon Willnauer
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5146.patch, LUCENE-5146.patch


 Uwe would say: sorry, but your code is wrong. We don't actually read the 
 weight value in AnalyzingComparator, which can cause really odd suggestions 
 since we read parts of the input as the weight. None of our tests catches 
 that, so I will go ahead and add some tests for it as well.




[jira] [Commented] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723695#comment-13723695
 ] 

ASF subversion and git services commented on LUCENE-5146:
-

Commit 1508382 from [~simonw] in branch 'dev/trunk'
[ https://svn.apache.org/r1508382 ]

LUCENE-5146: AnalyzingSuggester sort order doesn't respect the actual weight




[jira] [Commented] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723699#comment-13723699
 ] 

ASF subversion and git services commented on LUCENE-5146:
-

Commit 1508384 from [~simonw] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508384 ]

LUCENE-5146: AnalyzingSuggester sort order doesn't respect the actual weight




[jira] [Resolved] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight

2013-07-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-5146.
-

Resolution: Fixed
  Assignee: Simon Willnauer




[jira] [Created] (SOLR-5090) NPE in DirectSpellChecker with alternativeTermCount and mm.

2013-07-30 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-5090:
---

 Summary: NPE in DirectSpellChecker with alternativeTermCount and 
mm.
 Key: SOLR-5090
 URL: https://issues.apache.org/jira/browse/SOLR-5090
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 4.4
 Environment: 4.4.0 1504776 - sarowe - 2013-07-19 02:58:35
Reporter: Markus Jelsma
 Fix For: 5.0, 4.5


A query with three terms, of which one is misspelled, and 
spellcheck.alternativeTermCount=0&mm=3 yields the following NPE:

{code}
ERROR org.apache.solr.servlet.SolrDispatchFilter  – 
null:java.lang.NullPointerException
at 
org.apache.lucene.search.spell.DirectSpellChecker.suggestSimilar(DirectSpellChecker.java:422)
at 
org.apache.lucene.search.spell.DirectSpellChecker.suggestSimilar(DirectSpellChecker.java:355)
at 
org.apache.solr.spelling.DirectSolrSpellChecker.getSuggestions(DirectSolrSpellChecker.java:189)
at 
org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:188)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158
{code}




[jira] [Commented] (SOLR-5090) NPE in DirectSpellChecker with alternativeTermCount and mm.

2013-07-30 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723757#comment-13723757
 ] 

Markus Jelsma commented on SOLR-5090:
-

The NPE does not always show up if there is at least one misspelling. The 
following yields an error:
q=zinkoide vaseline creme&spellcheck.alternativeTermCount=0&mm=3 (zinkoxide is 
misspelled as zinkoide)

but this one doesn't:
q=zinkoide vaseline crème&spellcheck.alternativeTermCount=0&mm=3 (note the 
accent)

Accents are folded in our analyzers but not in the spellchecked field.

 NPE in DirectSpellChecker with alternativeTermCount and mm.
 ---

 Key: SOLR-5090
 URL: https://issues.apache.org/jira/browse/SOLR-5090
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 4.4
 Environment: 4.4.0 1504776 - sarowe - 2013-07-19 02:58:35
Reporter: Markus Jelsma
 Fix For: 5.0, 4.5


 Query with three terms of which one is misspelled and 
 spellcheck.alternativeTermCount=0mm=3 yields the following NPE:
 {code}
 ERROR org.apache.solr.servlet.SolrDispatchFilter  – 
 null:java.lang.NullPointerException
 at 
 org.apache.lucene.search.spell.DirectSpellChecker.suggestSimilar(DirectSpellChecker.java:422)
 at 
 org.apache.lucene.search.spell.DirectSpellChecker.suggestSimilar(DirectSpellChecker.java:355)
 at 
 org.apache.solr.spelling.DirectSolrSpellChecker.getSuggestions(DirectSolrSpellChecker.java:189)
 at 
 org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:188)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158
 {code}




[jira] [Commented] (SOLR-5057) queryResultCache should not be related to the order of the fq list

2013-07-30 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723769#comment-13723769
 ] 

Erick Erickson commented on SOLR-5057:
--

You can certainly submit the patch. So take Yonik's version,
put it in your code and try your test.

Then, please run the entire test suite (i.e. execute
'ant clean test' from the root).

But sure, then you can submit the patch. I think Yonik's
version addresses Hoss's comments; it seems to me that this
patch preserves efficiency without making it
a several-step operation to handle this case.

 queryResultCache should not be related to the order of the fq list
 ---

 Key: SOLR-5057
 URL: https://issues.apache.org/jira/browse/SOLR-5057
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 4.0, 4.1, 4.2, 4.3
Reporter: Feihong Huang
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-5057.patch, SOLR-5057.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 There are two queries with the same meaning below, but case 2 can't use 
 the queryResultCache after case 1 is executed.
 case1: q=*:*&fq=field1:value1&fq=field2:value2
 case2: q=*:*&fq=field2:value2&fq=field1:value1
 I think the queryResultCache should not depend on the order of the fq list.
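The order-insensitivity being discussed boils down to giving every permutation of the same fq list the same cache key. A minimal, hypothetical sketch (this is not Solr's actual QueryResultKey code; `keyFor` is an illustrative name): sort a copy of the filter list before hashing.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Solr's QueryResultKey: sort a copy of the fq list
// so every permutation of the same filters produces the same cache key.
class FqCacheKey {
    static int keyFor(String q, List<String> fqs) {
        String[] sorted = fqs.toArray(new String[0]);
        Arrays.sort(sorted);                          // canonical order
        return 31 * q.hashCode() + Arrays.hashCode(sorted);
    }
}
```

With a key like this, case1 and case2 above hash to the same entry, so the second query becomes a cache hit.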




[jira] [Assigned] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException

2013-07-30 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-5087:


Assignee: Erick Erickson  (was: Patrick Hunt)

 CoreAdminHandler.handleMergeAction generating NullPointerException
 --

 Key: SOLR-5087
 URL: https://issues.apache.org/jira/browse/SOLR-5087
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Patrick Hunt
Assignee: Erick Erickson
 Fix For: 5.0, 4.5

 Attachments: SOLR-5087.patch


 CoreAdminHandler.handleMergeAction is generating a NullPointerException.
 If directoryFactory.get(...) in handleMergeAction throws an exception, the 
 original error is lost, as the finally clause will attempt to clean up and 
 generate an NPE. (Notice that dirsToBeReleased is pre-allocated with nulls 
 that are not filled in.)
 {noformat}
 ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException
 at 
 org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180)
 {noformat}
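The fix the description implies can be sketched generically: a finally block that releases a pre-allocated array must skip the slots that were never filled. The names below are illustrative stand-ins, not Solr's DirectoryFactory/CoreAdminHandler API.

```java
// Illustrative stand-ins, not Solr's DirectoryFactory/CoreAdminHandler API.
class ReleaseSketch {
    static int released = 0;

    static void release(Object dir) {
        released++;
    }

    static void mergeAndCleanup(Object[] dirsToBeReleased) {
        try {
            // Stand-in for directoryFactory.get(...) failing part-way through,
            // leaving later slots of dirsToBeReleased null.
            throw new RuntimeException("get failed");
        } finally {
            for (Object dir : dirsToBeReleased) {
                if (dir != null) {   // skip unfilled slots: no NPE, and the
                    release(dir);    // original exception propagates intact
                }
            }
        }
    }
}
```

Without the null check, the finally block throws an NPE that masks the original failure, which is exactly the symptom in the stack trace above.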




[jira] [Commented] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight

2013-07-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723788#comment-13723788
 ] 

Michael McCandless commented on LUCENE-5146:


Thanks Simon!

 AnalyzingSuggester sort order doesn't respect the actual weight
 ---

 Key: LUCENE-5146
 URL: https://issues.apache.org/jira/browse/LUCENE-5146
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 4.4
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5146.patch, LUCENE-5146.patch


 Uwe would say: sorry, but your code is wrong. We don't actually read the 
 weight value in AnalyzingComparator, which can cause really odd suggestions 
 since we read parts of the input as the weight. None of our tests catches that, 
 so I will go ahead and add some tests for it as well.




[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException

2013-07-30 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723790#comment-13723790
 ] 

Erick Erickson commented on SOLR-5087:
--

Testing, merging and committing shortly.

 CoreAdminHandler.handleMergeAction generating NullPointerException
 --

 Key: SOLR-5087
 URL: https://issues.apache.org/jira/browse/SOLR-5087
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Patrick Hunt
Assignee: Erick Erickson
 Fix For: 5.0, 4.5

 Attachments: SOLR-5087.patch


 CoreAdminHandler.handleMergeAction is generating a NullPointerException.
 If directoryFactory.get(...) in handleMergeAction throws an exception, the 
 original error is lost, as the finally clause will attempt to clean up and 
 generate an NPE. (Notice that dirsToBeReleased is pre-allocated with nulls 
 that are not filled in.)
 {noformat}
 ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException
 at 
 org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180)
 {noformat}




Re: [JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 335 - Failure

2013-07-30 Thread Michael McCandless
I committed a fix.

Mike McCandless

http://blog.mikemccandless.com


On Tue, Jul 30, 2013 at 1:09 AM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/335/

 2 tests failed.
 FAILED:  
 org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat.testEmptyField

 Error Message:


 Stack Trace:
 java.lang.AssertionError
 at 
 __randomizedtesting.SeedInfo.seed([6D64DFCE9911F67B:B07B8389FB8EACE0]:0)
 at org.junit.Assert.fail(Assert.java:92)
 at org.junit.Assert.assertTrue(Assert.java:43)
 at org.junit.Assert.assertTrue(Assert.java:54)
 at 
 org.apache.lucene.index.BasePostingsFormatTestCase.testEmptyField(BasePostingsFormatTestCase.java:1154)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at java.lang.Thread.run(Thread.java:724)


 FAILED:  
 org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat.testEmptyFieldAndEmptyTerm

 Error Message:


 Stack Trace:
 java.lang.AssertionError
 at 
 __randomizedtesting.SeedInfo.seed([6D64DFCE9911F67B:EF1EF4C8B9869F55]:0)
 at 

[jira] [Commented] (LUCENE-5145) Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723804#comment-13723804
 ] 

ASF subversion and git services commented on LUCENE-5145:
-

Commit 1508423 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1508423 ]

LUCENE-5145: AppendingPackedLongBuffer and added support for bulk get operations 
to the Appending*Buffers.

Introduced bulk retrieval to AbstractAppendingLongBuffer
classes, for faster retrieval. Introduced a new variant,
AppendingPackedLongBuffer which solely relies on PackedInts as a backend.
This new class is useful where people have non-negative numbers with a
uniform distribution over a fixed (limited) range. Ex. facets ordinals. To
distinguish it from AppendingPackedLongBuffer, the delta-based
AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer. Fixed an
issue with NullReader where it didn't respect its valueCount in bulk gets.

 Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family 
 (customizable compression ratio + bulk retrieval)
 ---

 Key: LUCENE-5145
 URL: https://issues.apache.org/jira/browse/LUCENE-5145
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Boaz Leskes
Assignee: Adrien Grand
 Attachments: LUCENE-5145.patch, LUCENE-5145.v2.patch


 Made acceptableOverheadRatio configurable 
 Added bulk get to AbstractAppendingLongBuffer classes, for faster retrieval.
 Introduced a new variant, AppendingPackedLongBuffer which solely relies on 
 PackedInts as a back-end. This new class is useful where people have 
 non-negative numbers with a fairly uniform distribution over a fixed 
 (limited) range. Ex. facets ordinals.
 To distinguish it from AppendingPackedLongBuffer, delta based 
 AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer
 Fixed an issue with NullReader where it didn't respect its valueCount in 
 bulk gets.
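As an illustration of why the direct-packed variant helps for uniformly distributed values (the Lucene classes themselves are more involved), compare the bits per value each scheme needs. This is a pedagogical sketch, not the Lucene implementation; the method names are illustrative.

```java
// Illustrative, not the Lucene implementation: bits per value for direct
// packing versus delta packing against the previous value.
class PackingSketch {
    static int bitsRequired(long maxValue) {
        return maxValue == 0 ? 1 : 64 - Long.numberOfLeadingZeros(maxValue);
    }

    // Bits per value when each value is stored as-is
    // (AppendingPackedLongBuffer-style).
    static int directBits(long[] values) {
        long max = 0;
        for (long v : values) {
            max = Math.max(max, v);
        }
        return bitsRequired(max);
    }

    // Bits per value when each value is stored as a signed delta from its
    // predecessor (AppendingDeltaPackedLongBuffer-style).
    static int deltaBits(long[] values) {
        long maxDelta = 0;
        for (int i = 1; i < values.length; i++) {
            maxDelta = Math.max(maxDelta, Math.abs(values[i] - values[i - 1]));
        }
        return bitsRequired(maxDelta) + 1; // +1 sign bit for negative deltas
    }
}
```

For ascending sequences like doc IDs the deltas are tiny, so delta packing wins; for shuffled values over a fixed range (facet ordinals), the deltas are as large as the values themselves, and direct packing avoids the per-value sign bit and delta bookkeeping.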




[jira] [Commented] (LUCENE-5145) Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723818#comment-13723818
 ] 

ASF subversion and git services commented on LUCENE-5145:
-

Commit 1508430 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508430 ]

LUCENE-5145: AppendingPackedLongBuffer and added support for bulk get operations 
to the Appending*Buffers.

Introduced bulk retrieval to AbstractAppendingLongBuffer
classes, for faster retrieval. Introduced a new variant,
AppendingPackedLongBuffer which solely relies on PackedInts as a backend.
This new class is useful where people have non-negative numbers with a
uniform distribution over a fixed (limited) range. Ex. facets ordinals. To
distinguish it from AppendingPackedLongBuffer, the delta-based
AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer. Fixed an
issue with NullReader where it didn't respect its valueCount in bulk gets.

 Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family 
 (customizable compression ratio + bulk retrieval)
 ---

 Key: LUCENE-5145
 URL: https://issues.apache.org/jira/browse/LUCENE-5145
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Boaz Leskes
Assignee: Adrien Grand
 Attachments: LUCENE-5145.patch, LUCENE-5145.v2.patch


 Made acceptableOverheadRatio configurable 
 Added bulk get to AbstractAppendingLongBuffer classes, for faster retrieval.
 Introduced a new variant, AppendingPackedLongBuffer which solely relies on 
 PackedInts as a back-end. This new class is useful where people have 
 non-negative numbers with a fairly uniform distribution over a fixed 
 (limited) range. Ex. facets ordinals.
 To distinguish it from AppendingPackedLongBuffer, delta based 
 AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer
 Fixed an issue with NullReader where it didn't respect its valueCount in 
 bulk gets.




[jira] [Created] (LUCENE-5151) Associations aggregators enter an infinite loop if some documents have no category associations

2013-07-30 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-5151:
--

 Summary: Associations aggregators enter an infinite loop if some 
documents have no category associations
 Key: LUCENE-5151
 URL: https://issues.apache.org/jira/browse/LUCENE-5151
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera


Stupid error, they do this:

{code}
if (bytes.length == 0) {
  continue;
}
{code}

Since they don't advance 'doc', they hang on that 'if' forever. I'll post a fix 
soon.
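The fix pattern is generic: advance the cursor before any skip. A minimal sketch (the names are illustrative, not the facet module's actual aggregator code), with an array standing in for the per-document payload iterator:

```java
// 'docs' stands in for an iterator over per-document payloads.
class AdvanceSketch {
    static int countNonEmpty(byte[][] docs) {
        int count = 0;
        int doc = 0;
        while (doc < docs.length) {
            byte[] bytes = docs[doc];
            doc++;                 // advance BEFORE any 'continue' ...
            if (bytes.length == 0) {
                continue;          // ... so empty docs no longer loop forever
            }
            count++;
        }
        return count;
    }
}
```

If the `doc++` were placed after the empty-check, the first empty document would be re-tested on every iteration, which is the infinite loop reported here.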




[jira] [Resolved] (LUCENE-5145) Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)

2013-07-30 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-5145.
--

   Resolution: Fixed
Fix Version/s: 4.5
   5.0

Committed. Thanks Boaz!

 Added AppendingPackedLongBuffer & extended AbstractAppendingLongBuffer family 
 (customizable compression ratio + bulk retrieval)
 ---

 Key: LUCENE-5145
 URL: https://issues.apache.org/jira/browse/LUCENE-5145
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Boaz Leskes
Assignee: Adrien Grand
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5145.patch, LUCENE-5145.v2.patch


 Made acceptableOverheadRatio configurable 
 Added bulk get to AbstractAppendingLongBuffer classes, for faster retrieval.
 Introduced a new variant, AppendingPackedLongBuffer which solely relies on 
 PackedInts as a back-end. This new class is useful where people have 
 non-negative numbers with a fairly uniform distribution over a fixed 
 (limited) range. Ex. facets ordinals.
 To distinguish it from AppendingPackedLongBuffer, delta based 
 AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer
 Fixed an issue with NullReader where it didn't respect its valueCount in 
 bulk gets.




[jira] [Updated] (LUCENE-5151) Associations aggregators enter an infinite loop if some documents have no category associations

2013-07-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5151:
---

Attachment: LUCENE-5151.patch

Simple fix. I modified the test to insert some empty documents in the middle. 
I'll commit shortly.

 Associations aggregators enter an infinite loop if some documents have no 
 category associations
 ---

 Key: LUCENE-5151
 URL: https://issues.apache.org/jira/browse/LUCENE-5151
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5151.patch


 Stupid error, they do this:
 {code}
 if (bytes.length == 0) {
   continue;
 }
 {code}
 Since they don't advance 'doc', they hang on that 'if' forever. I'll post a fix 
 soon.




[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-07-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723853#comment-13723853
 ] 

Mark Miller commented on SOLR-5081:
---

This is likely the same issue that has come up before, and it has nothing to 
do with CloudSolrServer. It's more likely due to how we limit the number of 
threads used to forward updates: the nodes can talk back and forth to 
each other, run out of threads, and deadlock. It's similar to the distrib 
deadlock issue. It's been a known issue for many months; we just have not had a 
chance to look into it closely yet.
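The failure mode described here can be sketched with a bounded permit pool: if forwarding blocks indefinitely on a permit while every permit holder is itself waiting on a peer's exhausted pool, the nodes hang. Acquiring with a timeout is one way to turn exhaustion into a visible failure instead. This is an illustrative sketch, not Solr's update-forwarding code.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Bounded permit pool standing in for the limited forwarding threads.
class ForwardSketch {
    static boolean tryForward(Semaphore permits, long timeoutMs)
            throws InterruptedException {
        // A plain acquire() here can deadlock once every permit is held by a
        // request that is itself waiting on the peer node's exhausted pool.
        if (!permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false;          // exhausted: fail visibly instead of hanging
        }
        try {
            // ... forward the update to the peer node ...
            return true;
        } finally {
            permits.release();
        }
    }
}
```

Backing off on exhaustion is only a mitigation; the underlying fix still has to break the cyclic wait between nodes.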

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 18-node 
 SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.




[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-07-30 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723856#comment-13723856
 ] 

Erick Erickson commented on SOLR-5081:
--

Agreed, although we should be able to see the deadlock on the semaphore that we 
saw before in SolrCmdDistributor in here somewhere, and it's not in the stack 
trace we've seen so far.

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 18-node 
 SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.




[jira] [Commented] (SOLR-5090) NPE in DirectSpellChecker with alternativeTermCount and mm.

2013-07-30 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723862#comment-13723862
 ] 

Jack Krupansky commented on SOLR-5090:
--

This NPE has a similar signature to an NPE that I filed back in January: 
SOLR-4320.

I also filed several NPEs against spellcheck back then: SOLR-4366, SOLR-4304, 
SOLR-4399.


 NPE in DirectSpellChecker with alternativeTermCount and mm.
 ---

 Key: SOLR-5090
 URL: https://issues.apache.org/jira/browse/SOLR-5090
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 4.4
 Environment: 4.4.0 1504776 - sarowe - 2013-07-19 02:58:35
Reporter: Markus Jelsma
 Fix For: 5.0, 4.5


 Query with three terms of which one is misspelled and 
 spellcheck.alternativeTermCount=0&mm=3 yields the following NPE:
 {code}
 ERROR org.apache.solr.servlet.SolrDispatchFilter  – 
 null:java.lang.NullPointerException
 at 
 org.apache.lucene.search.spell.DirectSpellChecker.suggestSimilar(DirectSpellChecker.java:422)
 at 
 org.apache.lucene.search.spell.DirectSpellChecker.suggestSimilar(DirectSpellChecker.java:355)
 at 
 org.apache.solr.spelling.DirectSolrSpellChecker.getSuggestions(DirectSolrSpellChecker.java:189)
 at 
 org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:188)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158
 {code}




[jira] [Commented] (LUCENE-5151) Associations aggregators enter an infinite loop if some documents have no category associations

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723869#comment-13723869
 ] 

ASF subversion and git services commented on LUCENE-5151:
-

Commit 1508440 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1508440 ]

LUCENE-5151: Associations aggregators enter an infinite loop if some documents 
have no category associations

 Associations aggregators enter an infinite loop if some documents have no 
 category associations
 ---

 Key: LUCENE-5151
 URL: https://issues.apache.org/jira/browse/LUCENE-5151
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5151.patch


 Stupid error, they do this:
 {code}
 if (bytes.length == 0) {
   continue;
 }
 {code}
 Since they don't advance 'doc', they hang on that 'if' forever. I'll post a fix 
 soon.




Re: svn commit: r1508423 - in /lucene/dev/trunk/lucene: ./ core/src/java/org/apache/lucene/index/ core/src/java/org/apache/lucene/util/ core/src/java/org/apache/lucene/util/packed/ core/src/test/org/a

2013-07-30 Thread Robert Muir
On Tue, Jul 30, 2013 at 8:42 AM, jpou...@apache.org wrote:

 Author: jpountz
 Date: Tue Jul 30 12:42:39 2013
 New Revision: 1508423

 URL: http://svn.apache.org/r1508423
 Log:
 LUCENE-5145: AppendingPackedLongBuffer and added support for bulk get
 operations to the Appending*Buffers.

 Introduced bulk retrieval to AbstractAppendingLongBuffer
 classes, for faster retrieval. Introduced a new variant,
 AppendingPackedLongBuffer which solely relies on PackedInts as a backend.
 This new class is useful where people have non-negative numbers with a
 uniform distribution over a fixed (limited) range. Ex. facets ordinals. To
 distinguish it from AppendingPackedLongBuffer, the delta-based
 AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer. Fixed an
 issue with NullReader where it didn't respect its valueCount in bulk gets.


Do you think we should remove 'Long' from this class to help with the name?
e.g. can it just be AppendingPackedBuffer? (or something shorter)


[jira] [Commented] (SOLR-4981) BasicDistributedZkTest fails on FreeBSD jenkins due to thread leak.

2013-07-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723872#comment-13723872
 ] 

Mark Miller commented on SOLR-4981:
---

I think that may have worked.

 BasicDistributedZkTest fails on FreeBSD jenkins due to thread leak.
 ---

 Key: SOLR-4981
 URL: https://issues.apache.org/jira/browse/SOLR-4981
 Project: Solr
  Issue Type: Test
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor






[jira] [Updated] (LUCENE-5152) Lucene FST is not immutable

2013-07-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-5152:


Attachment: LUCENE-5152.patch

Here is a patch with the assert and a nocommit in MemoryPostingsFormat.

 Lucene FST is not immutable
 --

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch


 A spinoff from LUCENE-5120, where the analyzing suggester modified a returned 
 output from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.




[jira] [Commented] (LUCENE-5151) Associations aggregators enter an infinite loop if some documents have no category associations

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723881#comment-13723881
 ] 

ASF subversion and git services commented on LUCENE-5151:
-

Commit 1508451 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508451 ]

LUCENE-5151: Associations aggregators enter an infinite loop if some documents 
have no category associations

 Associations aggregators enter an infinite loop if some documents have no 
 category associations
 ---

 Key: LUCENE-5151
 URL: https://issues.apache.org/jira/browse/LUCENE-5151
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5151.patch


 Stupid error, they do this:
 {code}
 if (bytes.length == 0) {
   continue;
 }
 {code}
 Since they don't advance 'doc', they hang on that if-statement forever. I'll 
 post a fix soon.
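The shape of the bug (and the fix) can be sketched in plain Java; the cursor and payload names here are illustrative stand-ins, not Lucene's actual DocValues API:

```java
// Sketch of the LUCENE-5151 bug: skipping a doc with an empty payload
// without advancing the cursor loops forever. Names are hypothetical.
class DocLoopFix {
    static int countNonEmptyBuggy(byte[][] payloads) {
        // Buggy shape: 'continue' without advancing doc spins forever on an
        // empty payload; we cap iterations here to simulate the hang.
        int doc = 0, counted = 0, spins = 0;
        while (doc < payloads.length) {
            if (payloads[doc].length == 0) {
                if (++spins > 1000) return -1; // simulated hang detected
                continue;                      // BUG: doc is never advanced
            }
            counted++;
            doc++;
        }
        return counted;
    }

    static int countNonEmptyFixed(byte[][] payloads) {
        int counted = 0;
        for (int doc = 0; doc < payloads.length; doc++) {
            if (payloads[doc].length == 0) {
                continue; // safe: the for-loop increment still advances doc
            }
            counted++;
        }
        return counted;
    }

    public static void main(String[] args) {
        byte[][] docs = { {1}, {}, {2, 3} };
        System.out.println(countNonEmptyBuggy(docs)); // -1: buggy loop hangs
        System.out.println(countNonEmptyFixed(docs)); // 2
    }
}
```

The for-loop form makes the advance unconditional, so an empty payload can no longer stall the iteration.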




[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-07-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723878#comment-13723878
 ] 

Mark Miller commented on SOLR-5081:
---

bq. the stack trace we've seen so far.

Those traces are suspect for the problem described, I think. Regardless, for 
this type of thing, it would be great to get the traces from a couple of 
machines rather than just one.

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 
 18-node SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.




[jira] [Created] (LUCENE-5152) Lucene FST is not immutable

2013-07-30 Thread Simon Willnauer (JIRA)
Simon Willnauer created LUCENE-5152:
---

 Summary: Lucene FST is not immutable
 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5
 Attachments: LUCENE-5152.patch

A spin-off from LUCENE-5120, where the analyzing suggester modified an output 
returned from an FST (BytesRef), which caused side effects in later execution. 

I added an assertion into the FST that checks whether a cached root arc is 
modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
and I bet we'll find more places. We need to think about how to make this less 
trappy, since it can cause bugs that are super hard to find.




[jira] [Resolved] (LUCENE-5151) Associations aggregators enter an infinite loop if some documents have no category associations

2013-07-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5151.


   Resolution: Fixed
Fix Version/s: 4.5
   5.0

Committed.

 Associations aggregators enter an infinite loop if some documents have no 
 category associations
 ---

 Key: LUCENE-5151
 URL: https://issues.apache.org/jira/browse/LUCENE-5151
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5151.patch


 Stupid error, they do this:
 {code}
 if (bytes.length == 0) {
   continue;
 }
 {code}
 Since they don't advance 'doc', they hang on that if-statement forever. I'll 
 post a fix soon.




[jira] [Commented] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high & low freq terms

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723886#comment-13723886
 ] 

ASF subversion and git services commented on LUCENE-5149:
-

Commit 1508453 from [~simonw] in branch 'dev/trunk'
[ https://svn.apache.org/r1508453 ]

LUCENE-5149: CommonTermsQuery should allow minNrShouldMatch for high & low 
freq terms

 CommonTermsQuery should allow minNrShouldMatch for high & low freq terms
 

 Key: LUCENE-5149
 URL: https://issues.apache.org/jira/browse/LUCENE-5149
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 4.4
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5149.patch


 Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency 
 part of the query. Yet, we should also allow this for the high-frequency part 
 to have better control over scoring. Here is a related ES issue:
 https://github.com/elasticsearch/elasticsearch/issues/3188
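The rule the issue asks for can be modeled in a few lines of plain Java; the method and parameter names below are mine, not Lucene's actual CommonTermsQuery API, and this models only the boolean-matching rule, not scoring:

```java
import java.util.List;
import java.util.Set;

// Model of the matching rule LUCENE-5149 adds: a document must match at
// least lowFreqMinShouldMatch of the rare terms AND (after this change)
// at least highFreqMinShouldMatch of the common terms.
class MinShouldMatchModel {
    static boolean matches(Set<String> docTerms,
                           List<String> lowFreqTerms, int lowFreqMinShouldMatch,
                           List<String> highFreqTerms, int highFreqMinShouldMatch) {
        long lowHits = lowFreqTerms.stream().filter(docTerms::contains).count();
        long highHits = highFreqTerms.stream().filter(docTerms::contains).count();
        return lowHits >= lowFreqMinShouldMatch && highHits >= highFreqMinShouldMatch;
    }

    public static void main(String[] args) {
        Set<String> doc = Set.of("the", "quick", "fox");
        // Rare terms: quick, zebra; common terms: the, of.
        System.out.println(matches(doc, List.of("quick", "zebra"), 1,
                                        List.of("the", "of"), 1));  // true
        System.out.println(matches(doc, List.of("quick", "zebra"), 2,
                                        List.of("the", "of"), 1));  // false
    }
}
```

Before this change, only the low-frequency threshold existed; the high-frequency group was effectively an unconstrained SHOULD clause.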




[jira] [Resolved] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high & low freq terms

2013-07-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-5149.
-

Resolution: Fixed

 CommonTermsQuery should allow minNrShouldMatch for high & low freq terms
 

 Key: LUCENE-5149
 URL: https://issues.apache.org/jira/browse/LUCENE-5149
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 4.4
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5149.patch


 Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency 
 part of the query. Yet, we should also allow this for the high-frequency part 
 to have better control over scoring. Here is a related ES issue:
 https://github.com/elasticsearch/elasticsearch/issues/3188




[jira] [Commented] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high & low freq terms

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723889#comment-13723889
 ] 

ASF subversion and git services commented on LUCENE-5149:
-

Commit 1508455 from [~simonw] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508455 ]

LUCENE-5149: CommonTermsQuery should allow minNrShouldMatch for high & low 
freq terms

 CommonTermsQuery should allow minNrShouldMatch for high & low freq terms
 

 Key: LUCENE-5149
 URL: https://issues.apache.org/jira/browse/LUCENE-5149
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 4.4
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5149.patch


 Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency 
 part of the query. Yet, we should also allow this for the high-frequency part 
 to have better control over scoring. Here is a related ES issue:
 https://github.com/elasticsearch/elasticsearch/issues/3188




[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723890#comment-13723890
 ] 

Robert Muir commented on LUCENE-5152:
-

So it's really just a BytesRef bug, right? Because the root arc cache uses 
copyFrom, but this does a shallow copy of the output/nextFinalOutput, and in 
this case they point to the same bytes (which gives someone the chance to muck 
with them).
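A minimal plain-Java sketch of that aliasing, using hypothetical Arc/Output stand-ins rather than Lucene's real FST.Arc and BytesRef classes:

```java
import java.util.Arrays;

// Model of the shallow-copy aliasing: copyFrom leaves two "arcs" sharing one
// output byte array, so a caller that mutates the returned output silently
// corrupts the cached root arc. Names are illustrative, not the Lucene API.
class BytesRefAliasing {
    static final class Output { byte[] bytes; Output(byte[] b) { bytes = b; } }
    static final class Arc {
        Output output;
        Arc copyFrom(Arc other) { this.output = other.output; return this; } // shallow!
    }

    public static void main(String[] args) {
        Arc cachedRoot = new Arc();
        cachedRoot.output = new Output(new byte[] {1, 2, 3});

        Arc returned = new Arc().copyFrom(cachedRoot); // what the cache hands out
        returned.output.bytes[0] = 99;                  // caller mucks with the bytes

        // The cached arc sees the mutation: the FST is no longer immutable.
        System.out.println(Arrays.toString(cachedRoot.output.bytes)); // [99, 2, 3]

        // Defensive alternative: deep-copy the output when handing it out.
        returned.output = new Output(cachedRoot.output.bytes.clone());
    }
}
```

Either the cache must hand out deep copies, or callers must treat returned outputs as read-only; the assertion in the patch catches violations of the latter.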

 Lucene FST is not immutable
 --

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch


 A spin-off from LUCENE-5120, where the analyzing suggester modified an output 
 returned from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks whether a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we'll find more places. We need to think about how to make this less 
 trappy, since it can cause bugs that are super hard to find.




[jira] [Created] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-5153:
--

 Summary: Allow wrapping Reader from AnalyzerWrapper
 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera


It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
to initReader, e.g. with a CharFilter.
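A rough java.io analogue of the idea, with a FilterReader standing in for a Lucene CharFilter; all names here are illustrative, not the actual AnalyzerWrapper API:

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// A wrapper interposes a character-level filter in front of the wrapped
// analyzer's Reader. UpperCaseReader stands in for a CharFilter, and
// wrapReader stands in for AnalyzerWrapper.wrapReader(fieldName, reader).
class ReaderWrapDemo {
    static final class UpperCaseReader extends FilterReader {
        UpperCaseReader(Reader in) { super(in); }
        @Override public int read() throws IOException {
            int c = super.read();
            return c == -1 ? -1 : Character.toUpperCase(c);
        }
        @Override public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            for (int i = 0; i < n; i++) buf[off + i] = Character.toUpperCase(buf[off + i]);
            return n;
        }
    }

    // Stand-in for the proposed wrapReader hook: filter before delegating.
    static Reader wrapReader(String fieldName, Reader reader) {
        return new UpperCaseReader(reader);
    }

    static String consume(Reader r) {
        StringBuilder sb = new StringBuilder();
        try {
            for (int c; (c = r.read()) != -1; ) sb.append((char) c);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(consume(wrapReader("body", new StringReader("fst"))));
    }
}
```

Because the filter wraps the Reader before it reaches the wrapped analyzer, it sits at the very front of the charfilter chain, which is the ordering question discussed below in this thread.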




[jira] [Updated] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5153:
---

Attachment: LUCENE-5153.patch

Simple patch.

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-07-30 Thread Han Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723922#comment-13723922
 ] 

Han Jiang commented on LUCENE-5152:
---

bq. So it's really just a BytesRef bug, right?
+1, so tricky.

 Lucene FST is not immutable
 --

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch


 A spin-off from LUCENE-5120, where the analyzing suggester modified an output 
 returned from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks whether a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we'll find more places. We need to think about how to make this less 
 trappy, since it can cause bugs that are super hard to find.




[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723926#comment-13723926
 ] 

ASF subversion and git services commented on SOLR-5087:
---

Commit 1508476 from [~erickoerickson] in branch 'dev/trunk'
[ https://svn.apache.org/r1508476 ]

SOLR-5087, CoreAdminHandler.handleMergeAction generating NullPointerException. 
Thanks Patrick

 CoreAdminHandler.handleMergeAction generating NullPointerException
 --

 Key: SOLR-5087
 URL: https://issues.apache.org/jira/browse/SOLR-5087
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Patrick Hunt
Assignee: Erick Erickson
 Fix For: 5.0, 4.5

 Attachments: SOLR-5087.patch


 CoreAdminHandler.handleMergeAction is generating a NullPointerException.
 If directoryFactory.get(...) in handleMergeAction throws an exception, the 
 original error is lost, as the finally clause will attempt to clean up and 
 generate an NPE. (Notice that dirsToBeReleased is pre-allocated with nulls 
 that are not filled in.)
 {noformat}
 ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException
 at 
 org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180)
 {noformat}
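The failure mode can be modeled in a few lines of plain Java; the names mirror Solr's pattern, but this is a sketch, not the actual CoreAdminHandler or DirectoryFactory code:

```java
import java.util.ArrayList;
import java.util.List;

// Model of the SOLR-5087 failure: a finally block that releases a
// pre-allocated array of still-null slots throws NPE, masking the
// original exception from directoryFactory.get(...).
class MergeCleanup {
    static final class Dir { boolean released; void release() { released = true; } }

    static String mergeBuggy() {
        Dir[] dirsToBeReleased = new Dir[2]; // pre-allocated with nulls
        try {
            throw new IllegalStateException("original error from directoryFactory.get()");
        } finally {
            try {
                for (Dir d : dirsToBeReleased) d.release(); // NPE here masks the cause
            } catch (NullPointerException npe) {
                return "NullPointerException"; // what the logs actually showed
            }
        }
    }

    static String mergeFixed() {
        List<Dir> dirsToBeReleased = new ArrayList<>(); // add only dirs actually acquired
        try {
            throw new IllegalStateException("original error from directoryFactory.get()");
        } catch (IllegalStateException e) {
            return e.getMessage(); // the real cause survives
        } finally {
            for (Dir d : dirsToBeReleased) d.release(); // nothing null to trip over
        }
    }

    public static void main(String[] args) {
        System.out.println(mergeBuggy()); // NullPointerException
        System.out.println(mergeFixed()); // original error from directoryFactory.get()
    }
}
```

Collecting only the directories actually acquired (or null-checking each slot in the cleanup loop) keeps the finally block from ever throwing over the original exception.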




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723936#comment-13723936
 ] 

Shai Erera commented on LUCENE-5153:


If there are no objections, I'll commit it.

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723941#comment-13723941
 ] 

Robert Muir commented on LUCENE-5153:
-

One odd thing is that wrapComponents adds to the end of the TokenStream chain, 
but with this patch wrapReader inserts at the beginning of the charfilter 
chain.

Not saying it's wrong, but is it the right thing?

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Closed] (SOLR-4696) All threads become blocked resulting in hang when bulk adding

2013-07-30 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson closed SOLR-4696.


Resolution: Duplicate

I'm pretty sure this is a duplicate of SOLR-5081; we can re-open if not.

 All threads become blocked resulting in hang when bulk adding
 -

 Key: SOLR-4696
 URL: https://issues.apache.org/jira/browse/SOLR-4696
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.1, 4.2, 4.2.1
 Environment: Ubuntu 12.04.2 LTS 3.5.0-27-generic
 Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
 KVM, 4xCPU, 5GB RAM, 4GB heap.
 4 cores, 2 shards, 2 nodes, tomcat7
Reporter: matt knecht
  Labels: hang
 Attachments: screenshot-1.jpg, solrconfig.xml, solr.jstack.1, 
 solr.jstack.2


 During a bulk load, after about 150,000 documents are loaded, thread usage 
 spikes and Solr no longer processes any documents. Each additional document 
 added results in a new thread until the pool is exhausted.




[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723958#comment-13723958
 ] 

ASF subversion and git services commented on SOLR-5087:
---

Commit 1508491 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508491 ]

SOLR-5087, CoreAdminHandler.handleMergeAction generating NullPointerException. 
Thanks Patrick

 CoreAdminHandler.handleMergeAction generating NullPointerException
 --

 Key: SOLR-5087
 URL: https://issues.apache.org/jira/browse/SOLR-5087
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Patrick Hunt
Assignee: Erick Erickson
 Fix For: 5.0, 4.5

 Attachments: SOLR-5087.patch


 CoreAdminHandler.handleMergeAction is generating a NullPointerException.
 If directoryFactory.get(...) in handleMergeAction throws an exception, the 
 original error is lost, as the finally clause will attempt to clean up and 
 generate an NPE. (Notice that dirsToBeReleased is pre-allocated with nulls 
 that are not filled in.)
 {noformat}
 ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException
 at 
 org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180)
 {noformat}




[jira] [Resolved] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException

2013-07-30 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-5087.
--

Resolution: Fixed

Thanks Patrick! I forgot CHANGES.txt, I'll add shortly.

 CoreAdminHandler.handleMergeAction generating NullPointerException
 --

 Key: SOLR-5087
 URL: https://issues.apache.org/jira/browse/SOLR-5087
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Patrick Hunt
Assignee: Erick Erickson
 Fix For: 5.0, 4.5

 Attachments: SOLR-5087.patch


 CoreAdminHandler.handleMergeAction is generating a NullPointerException.
 If directoryFactory.get(...) in handleMergeAction throws an exception, the 
 original error is lost, as the finally clause will attempt to clean up and 
 generate an NPE. (Notice that dirsToBeReleased is pre-allocated with nulls 
 that are not filled in.)
 {noformat}
 ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException
 at 
 org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180)
 {noformat}




[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723961#comment-13723961
 ] 

ASF subversion and git services commented on SOLR-5087:
---

Commit 1508494 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508494 ]

Added entry for SOLR-5087

 CoreAdminHandler.handleMergeAction generating NullPointerException
 --

 Key: SOLR-5087
 URL: https://issues.apache.org/jira/browse/SOLR-5087
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Patrick Hunt
Assignee: Erick Erickson
 Fix For: 5.0, 4.5

 Attachments: SOLR-5087.patch


 CoreAdminHandler.handleMergeAction is generating a NullPointerException.
 If directoryFactory.get(...) in handleMergeAction throws an exception, the 
 original error is lost, as the finally clause will attempt to clean up and 
 generate an NPE. (Notice that dirsToBeReleased is pre-allocated with nulls 
 that are not filled in.)
 {noformat}
 ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException
 at 
 org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380)
 at 
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180)
 {noformat}




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723992#comment-13723992
 ] 

Adrien Grand commented on LUCENE-5153:
--

I think this is the right thing. On the contrary, if wrapReader inserted char 
filters at the end of the charfilter chain, the behavior of the wrapped 
analyzer would be altered (it would allow inserting something between the 
first CharFilter and the last TokenFilter of the wrapped analyzer).

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-07-30 Thread Mike Schrag (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723996#comment-13723996
 ] 

Mike Schrag commented on SOLR-5081:
---

I'll kill it again today and grab traces from a few of the nodes.

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 
 18-node SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.




[jira] [Created] (SOLR-5091) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation

2013-07-30 Thread Grant Ingersoll (JIRA)
Grant Ingersoll created SOLR-5091:
-

 Summary: Clean up Servlets APIs, Kill SolrDispatchFilter, simplify 
API creation
 Key: SOLR-5091
 URL: https://issues.apache.org/jira/browse/SOLR-5091
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
 Fix For: 5.0


This is an issue to track a series of sub issues related to deprecated and 
crufty Servlet/REST API code.  I'll create sub-tasks to manage them.

# Clean up all the old UI stuff (old redirects)
# Kill/Simplify SolrDispatchFilter -- for instance, why not make the user 
always have a core name in 5.0?  i.e. /collection1 is the default core
## I'd like to move to just using Guice's servlet extension to do this, which, 
I think, will also make it easier to run Solr in other containers (i.e. 
non-servlet environments) because you don't have to tie the 
request handling logic specifically to a Servlet.
# Simplify the creation and testing of REST and other APIs via Guice + Restlet, 
which I've done on a number of occasions.
## It might also be possible to move all of the APIs onto Restlet and maintain 
back compat through a simple restlet proxy (still exploring this).  This would 
also have the benefit of abstracting the core request processing out of the 
Servlet context and make that an implementation detail.
## Moving to Guice, IMO, will make it easier to isolate and test individual 
components by making it easier to inject mocks.

I am close to a working patch for some of this.  I will post incremental 
updates/issues as I move forward on this, but I think we should take 5.x as an 
opportunity to be more agnostic of container and I believe the approach I have 
in mind will do so.
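The "core name always in the path" idea from point 2 could be sketched roughly 
like this (illustrative plain Java only; `CorePath` and its behavior are my 
assumptions, not Solr code):

```java
// Illustrative sketch (not Solr code): if every request path must carry a
// core name, dispatch reduces to "split off the first path segment".
// "/collection1/select" -> core "collection1", handler path "/select".
public class CorePath {
    public static String[] parse(String path) {
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        int slash = trimmed.indexOf('/');
        String core = slash < 0 ? trimmed : trimmed.substring(0, slash);
        String handler = slash < 0 ? "/" : trimmed.substring(slash);
        return new String[] { core, handler };
    }

    public static void main(String[] args) {
        String[] parts = parse("/collection1/select");
        System.out.println(parts[0] + " -> " + parts[1]); // collection1 -> /select
    }
}
```

With a rule like this, the dispatch filter no longer needs special cases for a 
"default" core, which is part of what makes it simpler.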




[jira] [Commented] (LUCENE-5140) Slowdown of the span queries caused by LUCENE-4946

2013-07-30 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724003#comment-13724003
 ] 

Adrien Grand commented on LUCENE-5140:
--

If there is no objection, I will commit the patch as-is soon and have a look 
at the lucenebench reports over the next few days.

 Slowdown of the span queries caused by LUCENE-4946
 --

 Key: LUCENE-5140
 URL: https://issues.apache.org/jira/browse/LUCENE-5140
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5140.patch


 [~romseygeek] noticed that span queries have been slower since LUCENE-4946 
 got committed.
 http://people.apache.org/~mikemccand/lucenebench/SpanNear.html




[jira] [Resolved] (SOLR-2580) Create Components to Support Using Business Rules in Solr

2013-07-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-2580.
---

Resolution: Won't Fix

 Create Components to Support Using Business Rules in Solr
 -

 Key: SOLR-2580
 URL: https://issues.apache.org/jira/browse/SOLR-2580
 Project: Solr
  Issue Type: New Feature
  Components: Rules
Reporter: Tomás Fernández Löbbe
Assignee: Grant Ingersoll
 Fix For: 5.0, 4.5


 The goal is to be able to adjust the relevance of documents based on 
 user-defined business rules.
 For example, in an e-commerce site, when the user chooses the shoes 
 category, we may be interested in boosting products from a certain brand. 
 This can be expressed as a rule in the following way:
 rule "Boost Adidas products when searching shoes"
 when
 $qt : QueryTool()
 TermQuery(term.field=="category", term.text=="shoes")
 then
 $qt.boost("{!lucene}brand:adidas");
 end
 The QueryTool object should be used to alter the main query in an easy way. 
 Even more human-like rules can be written:
 rule "Boost Adidas products when searching shoes"
  when
 Query has term "shoes" in field "product"
  then
 Add boost query "{!lucene}brand:adidas"
 end
 These rules are written in a text file in the config directory and can be 
 modified at runtime. Rules will be managed using JBoss Drools: 
 http://www.jboss.org/drools/drools-expert.html
 On a first stage, it will allow to add boost queries or change sorting fields 
 based on the user query, but it could be extended to allow more options.




[jira] [Resolved] (LUCENE-1004) Create Lucene-Patch Build capability in Hudson

2013-07-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved LUCENE-1004.
-

Resolution: Won't Fix

 Create Lucene-Patch Build capability in Hudson
 --

 Key: LUCENE-1004
 URL: https://issues.apache.org/jira/browse/LUCENE-1004
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor

 This issue will be used to test the creation of a Lucene-Patch capability 
 in Hudson that automatically applies submitted patches (when the Patch 
 Available flag is checked) and then marks the issue with a +/-1 so that 
 committers know whether it works or not.




[jira] [Assigned] (SOLR-5091) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation

2013-07-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll reassigned SOLR-5091:
-

Assignee: Grant Ingersoll

 Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation
 --

 Key: SOLR-5091
 URL: https://issues.apache.org/jira/browse/SOLR-5091
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Fix For: 5.0


 This is an issue to track a series of sub issues related to deprecated and 
 crufty Servlet/REST API code.  I'll create sub-tasks to manage them.
 # Clean up all the old UI stuff (old redirects)
 # Kill/Simplify SolrDispatchFilter -- for instance, why not make the user 
 always have a core name in 5.0?  i.e. /collection1 is the default core
 ## I'd like to move to just using Guice's servlet extension to do this, 
 which I think will also make it easier to run Solr in other containers (i.e. 
 non-servlet environments), since you don't have to tie the request handling 
 logic specifically to a Servlet.
 # Simplify the creation and testing of REST and other APIs via Guice + 
 Restlet, which I've done on a number of occasions.
 ## It might also be possible to move all of the APIs onto Restlet and 
 maintain back compat through a simple Restlet proxy (still exploring this).  
 This would also have the benefit of abstracting the core request processing 
 out of the Servlet context and making that an implementation detail.
 ## Moving to Guice, IMO, will make it easier to isolate and test individual 
 components, since mocks can be injected more easily.
 I am close to a working patch for some of this.  I will post incremental 
 updates/issues as I move forward on this, but I think we should take 5.x as 
 an opportunity to be more agnostic of container and I believe the approach I 
 have in mind will do so.




[jira] [Resolved] (SOLR-2951) Augment QueryElevationComponent results

2013-07-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-2951.
---

Resolution: Won't Fix

DocTransformers

 Augment QueryElevationComponent results
 ---

 Key: SOLR-2951
 URL: https://issues.apache.org/jira/browse/SOLR-2951
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor

 It would be nice if, in the elevate.xml, you could add fields for the docs 
 that get added to, or modify, the document being returned.




[jira] [Updated] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-07-30 Thread Han Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Jiang updated LUCENE-3069:
--

Attachment: LUCENE-5152.patch

The previous design put much stress on decoding of Outputs. 
This becomes a disaster for wildcard queries: e.g. for f*nd, 
we usually have to walk to the last character in the FST, only to 
find that it is not 'd' and the automaton doesn't accept it. 
In this case, TempFST is actually iterating over all the results 
of f*, which decodes all the metadata for them...

So I'm trying another approach; the main idea is to load 
metadata & stats as lazily as possible. 
Here I use FST<Long> as the term index, and leave all other stuff 
in a single term block. The term index FST holds the relationship 
between Term & Ord, and in the term block we can maintain a skip list 
to find related metadata & stats.

It is a little similar to BTTR now, and we can someday control how much 
data to keep memory resident (e.g. keep stats in memory but metadata on 
disk; however, that should be another issue). 
Another good part is that it naturally supports seek by ord (ah, 
actually I don't understand where it is used).

Tests pass, but intersect is not implemented yet. 
Perf based on 1M wiki data, comparing non-intersect TempFST and TempFSTOrd:

{noformat}
            Task    QPS base      StdDev    QPS comp      StdDev    Pct diff
        PKLookup      373.80      (0.0%)      320.30      (0.0%)      -14.3% ( -14% -  -14%)
          Fuzzy1       43.82      (0.0%)       47.10      (0.0%)        7.5% (   7% -    7%)
         Prefix3      399.62      (0.0%)      433.95      (0.0%)        8.6% (   8% -    8%)
          Fuzzy2       14.26      (0.0%)       15.95      (0.0%)       11.9% (  11% -   11%)
         Respell       40.69      (0.0%)       46.29      (0.0%)       13.8% (  13% -   13%)
        Wildcard       83.44      (0.0%)       96.54      (0.0%)       15.7% (  15% -   15%)
{noformat}

The perf hit on PKLookup should be sane, since I haven't optimized the skip 
list yet.

I'll update intersect() later, and then we'll cut over to 
PagedBytes & PackedLongBuffer.


 Lucene should have an entirely memory resident term dictionary
 --

 Key: LUCENE-3069
 URL: https://issues.apache.org/jira/browse/LUCENE-3069
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/search
Affects Versions: 4.0-ALPHA
Reporter: Simon Willnauer
Assignee: Han Jiang
  Labels: gsoc2013
 Fix For: 5.0, 4.5

 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-5152.patch


 FST based TermDictionary has been a great improvement yet it still uses a 
 delta codec file for scanning to terms. Some environments have enough memory 
 available to keep the entire FST based term dict in memory. We should add a 
 TermDictionary implementation that encodes all needed information for each 
 term into the FST (custom fst.Output) and builds a FST from the entire term 
 not just the delta.




[jira] [Resolved] (SOLR-2965) Support Landing Pages/Redirects

2013-07-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-2965.
---

Resolution: Won't Fix

 Support Landing Pages/Redirects
 ---

 Key: SOLR-2965
 URL: https://issues.apache.org/jira/browse/SOLR-2965
 Project: Solr
  Issue Type: New Feature
  Components: Rules
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor

 In some cases, it is useful for the search engine to bypass doing any search 
 at all and simply return a result indicating the user should be redirected to 
 a landing page.  Initial thinking on implementation is to add a key/value to 
 the header and return no results.  This could be implemented in the 
 QueryElevationComponent (or it's extension, see SOLR-2580).




[jira] [Resolved] (SOLR-4024) DebugComponent enhancement to report on what documents are potentially missing fields

2013-07-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-4024.
---

Resolution: Won't Fix

StatsComponent can do this

 DebugComponent enhancement to report on what documents are potentially 
 missing fields
 -

 Key: SOLR-4024
 URL: https://issues.apache.org/jira/browse/SOLR-4024
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 5.0, 4.5


 It's often handy when debugging to know when a document is missing a field 
 that is either searched against or in the schema




[jira] [Updated] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-07-30 Thread Han Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Jiang updated LUCENE-3069:
--

Attachment: LUCENE-3069.patch

 Lucene should have an entirely memory resident term dictionary
 --

 Key: LUCENE-3069
 URL: https://issues.apache.org/jira/browse/LUCENE-3069
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/search
Affects Versions: 4.0-ALPHA
Reporter: Simon Willnauer
Assignee: Han Jiang
  Labels: gsoc2013
 Fix For: 5.0, 4.5

 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch


 FST based TermDictionary has been a great improvement yet it still uses a 
 delta codec file for scanning to terms. Some environments have enough memory 
 available to keep the entire FST based term dict in memory. We should add a 
 TermDictionary implementation that encodes all needed information for each 
 term into the FST (custom fst.Output) and builds a FST from the entire term 
 not just the delta.




[jira] [Updated] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-07-30 Thread Han Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Jiang updated LUCENE-3069:
--

Attachment: (was: LUCENE-5152.patch)

 Lucene should have an entirely memory resident term dictionary
 --

 Key: LUCENE-3069
 URL: https://issues.apache.org/jira/browse/LUCENE-3069
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/search
Affects Versions: 4.0-ALPHA
Reporter: Simon Willnauer
Assignee: Han Jiang
  Labels: gsoc2013
 Fix For: 5.0, 4.5

 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch


 FST based TermDictionary has been a great improvement yet it still uses a 
 delta codec file for scanning to terms. Some environments have enough memory 
 available to keep the entire FST based term dict in memory. We should add a 
 TermDictionary implementation that encodes all needed information for each 
 term into the FST (custom fst.Output) and builds a FST from the entire term 
 not just the delta.




[jira] [Commented] (SOLR-4951) randomize merge policy testing in solr

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724052#comment-13724052
 ] 

ASF subversion and git services commented on SOLR-4951:
---

Commit 1508521 from hoss...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1508521 ]

SOLR-4951: Better randomization of MergePolicy in Solr tests

 randomize merge policy testing in solr
 --

 Key: SOLR-4951
 URL: https://issues.apache.org/jira/browse/SOLR-4951
 Project: Solr
  Issue Type: Sub-task
Reporter: Hoss Man
 Attachments: SOLR-4951.patch


 split off from SOLR-4942...
 * add a new RandomMergePolicy that implements MergePolicy by proxying to 
 another instance selected at creation using one of the 
 LuceneTestCase.new...MergePolicy methods
 * updated test configs to refer to this new MergePolicy
 * borrow the tests.shardhandler.randomSeed logic in SolrTestCaseJ4 to give 
 our RandomMergePolicy a consistent seed at runtime.
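The proxy-at-creation idea described above can be sketched generically 
(illustrative plain Java; the `Policy` interface and all names here are 
hypothetical stand-ins, not Lucene's actual MergePolicy API):

```java
import java.util.List;
import java.util.Random;

// Generic sketch of the RandomMergePolicy idea: choose one concrete
// implementation at construction time using a seeded Random (so a given
// test run is reproducible), then forward every call to that delegate.
interface Policy {
    String name();
}

class RandomDelegatingPolicy implements Policy {
    private final Policy delegate;

    RandomDelegatingPolicy(List<Policy> candidates, long seed) {
        // Same seed + same candidate list => same delegate every time.
        this.delegate = candidates.get(new Random(seed).nextInt(candidates.size()));
    }

    @Override
    public String name() {
        return delegate.name();
    }

    public static void main(String[] args) {
        List<Policy> candidates = List.of(
            () -> "tiered", () -> "logbyte", () -> "logdoc");
        Policy p1 = new RandomDelegatingPolicy(candidates, 42L);
        Policy p2 = new RandomDelegatingPolicy(candidates, 42L);
        System.out.println(p1.name().equals(p2.name())); // true: same seed, same pick
    }
}
```

Fixing the seed per run is what lets a failing test with a randomly chosen 
policy be reproduced later.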




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724119#comment-13724119
 ] 

Shai Erera commented on LUCENE-5153:


bq. I think this is the right thing?

I tend to agree. If by wrapping we look at the wrapped object as a black box, 
then we should only allow intervention on its fronts -- before its char filters 
and after its token stream.

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724124#comment-13724124
 ] 

Robert Muir commented on LUCENE-5153:
-

Sounds good to me! +1 to the patch, though we might want to add a test.

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724131#comment-13724131
 ] 

Hoss Man commented on LUCENE-5153:
--

FWIW: a recent thread on this very point...

http://mail-archives.apache.org/mod_mbox/lucene-java-user/201306.mbox/%3cad079bd2-e01e-4e00-b8f6-17594b6c4...@likeness.com%3E

+1 to the wrapReader semantics in the patch.

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (SOLR-4951) randomize merge policy testing in solr

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724154#comment-13724154
 ] 

ASF subversion and git services commented on SOLR-4951:
---

Commit 1508552 from hoss...@apache.org in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508552 ]

SOLR-4951: Better randomization of MergePolicy in Solr tests (merge r1508521)

 randomize merge policy testing in solr
 --

 Key: SOLR-4951
 URL: https://issues.apache.org/jira/browse/SOLR-4951
 Project: Solr
  Issue Type: Sub-task
Reporter: Hoss Man
 Attachments: SOLR-4951.patch


 split off from SOLR-4942...
 * add a new RandomMergePolicy that implements MergePolicy by proxying to 
 another instance selected at creation using one of the 
 LuceneTestCase.new...MergePolicy methods
 * updated test configs to refer to this new MergePolicy
 * borrow the tests.shardhandler.randomSeed logic in SolrTestCaseJ4 to give 
 our RandomMergePolicy a consistent seed at runtime.




[jira] [Resolved] (SOLR-4951) randomize merge policy testing in solr

2013-07-30 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-4951.


   Resolution: Fixed
Fix Version/s: 5.0, 4.5
     Assignee: Hoss Man

r1508521 & r1508552

 randomize merge policy testing in solr
 --

 Key: SOLR-4951
 URL: https://issues.apache.org/jira/browse/SOLR-4951
 Project: Solr
  Issue Type: Sub-task
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 5.0, 4.5

 Attachments: SOLR-4951.patch


 split off from SOLR-4942...
 * add a new RandomMergePolicy that implements MergePolicy by proxying to 
 another instance selected at creation using one of the 
 LuceneTestCase.new...MergePolicy methods
 * updated test configs to refer to this new MergePolicy
 * borrow the tests.shardhandler.randomSeed logic in SolrTestCaseJ4 to give 
 our RandomMergePolicy a consistent seed at runtime.




[jira] [Updated] (SOLR-4221) Custom sharding

2013-07-30 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-4221:
-

Attachment: SOLR-4221.patch

OverseerCollectionProcessor test errors fixed

 Custom sharding
 ---

 Key: SOLR-4221
 URL: https://issues.apache.org/jira/browse/SOLR-4221
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
Assignee: Noble Paul
 Attachments: SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch


 Features to let users control everything about sharding/routing.




Re: [jira] [Commented] (SOLR-5080) Ability to Configure Expirable Caches (use Google Collections - MapMaker/CacheBuilder for SolrCache)

2013-07-30 Thread Kranti Parisa
Agree with you; we do have unique identifiers for the 5-minute windows in
the form of the window start time. I just wanted to let GC clean up unused
caches instead of keeping them on the JVM heap, so that we can use the
JVM/RAM for serving more queries, as it would have more free memory.
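The per-entry TTL semantics this thread asks for (what Guava's 
CacheBuilder.expireAfterWrite provides) can be sketched in plain Java, 
purely as an illustration; `ExpiringCache` is a made-up name, not a Solr or 
Guava class:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of write-time expiry: entries older than ttlMillis are
// treated as absent and evicted on access. Time is passed in explicitly
// so the behavior is easy to test deterministically.
class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long writtenAt;
        Entry(V value, long writtenAt) { this.value = value; this.writtenAt = writtenAt; }
    }

    private final long ttlMillis;
    private final Map<K, Entry<V>> map = new HashMap<>();

    ExpiringCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    void put(K key, V value, long nowMillis) {
        map.put(key, new Entry<>(value, nowMillis));
    }

    V get(K key, long nowMillis) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (nowMillis - e.writtenAt > ttlMillis) {
            map.remove(key);   // expired: drop the entry so GC can reclaim it
            return null;
        }
        return e.value;
    }

    public static void main(String[] args) {
        ExpiringCache<String, String> cache = new ExpiringCache<>(5 * 60 * 1000);
        cache.put("fq:availability", "docset", 0L);
        System.out.println(cache.get("fq:availability", 60_000L));   // docset
        System.out.println(cache.get("fq:availability", 301_000L));  // null
    }
}
```

A real SolrCache implementation would also need autowarming and statistics; 
this only shows the expiry behavior under discussion.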

Do you have any suggestions for common JVM settings while using Solr
(of course the values depend on the actual use case), something similar to
http://jprante.github.io/2012/11/28/Elasticsearch-Java-Virtual-Machine-settings-explained.html?

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Sun, Jul 28, 2013 at 6:42 PM, Erick Erickson erickerick...@gmail.com wrote:

 I'd certainly do that before trying to have a custom cache policy. Measure,
 _then_ fix. If you have your autowarm parameters set up, when your
 searchers come up you'll get good responses on your queries.

 Of course that will put some load on the machine, but find out whether
 the load is noticeable before you make the switch.

 Or be really cheap: for the 5-minute interval, tack on some kind of
 meaningless value to the fq that doesn't change its effect. Then change that
 value every 5 minutes, and your old fq cache entries won't be re-used and
 will age out as time passes.
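 That cheap trick could look like this (illustrative Java; the marker clause
 shape and the "cachebuster_s" field name are my assumptions, not from the
 thread):

```java
public class FqBucket {
    // Append a semantically harmless marker that changes every 5 minutes,
    // so existing filterCache entries for this fq stop matching the cache
    // key and simply age out. "(*:* OR ...)" matches everything, so ANDing
    // it does not change which documents the fq selects.
    static String bucketedFq(String fq, long epochSeconds) {
        long bucket = epochSeconds / 300; // id of the current 5-minute window
        return fq + " AND (*:* OR cachebuster_s:" + bucket + ")";
    }

    public static void main(String[] args) {
        // Two timestamps in different 5-minute windows produce different keys.
        System.out.println(bucketedFq("inStock:true", 100));  // window 0
        System.out.println(bucketedFq("inStock:true", 400));  // window 1
    }
}
```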

 FWIW,
 Erick

 On Sun, Jul 28, 2013 at 12:37 PM, Kranti Parisa (JIRA) j...@apache.org
 wrote:
 
  [
 https://issues.apache.org/jira/browse/SOLR-5080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721993#comment-13721993]
 
  Kranti Parisa commented on SOLR-5080:
  -
 
  Sure, a new searcher will invalidate the caches. But the use case is that we
 don't want to expire caches other than the FilterCache. And for us the
 filters are time-bounded: the availability changes every 5 minutes. I
 am trying to set up a multi-core environment and use joins (with fq).
 Replication happens every 30 min. If we open a new searcher every 5
 min, then all the other caches are also invalidated, and at runtime it
 may cost us to rebuild those caches. Instead, the idea is to have a
 facility to configure the filterCaches with a 5-min expiration policy on one
 of the cores (where availability changes every 5 min), so that we can
 maintain the JVM sizes, which will also be an important factor under high load.
 
  So, you suggest opening a new searcher, which will invalidate all the caches
 on the specific core?
 
  Ability to Configure Expirable Caches (use Google Collections -
 MapMaker/CacheBuilder for SolrCache)
 
 
 
  Key: SOLR-5080
  URL: https://issues.apache.org/jira/browse/SOLR-5080
  Project: Solr
   Issue Type: New Feature
 Reporter: Kranti Parisa
 
  We should be able to configure the expirable caches, especially for
 filterCaches. In some cases, the filterCaches are not valid beyond certain
 time (example 5 minutes).
  Google collections has MapMaker/CacheBuilder which does allow expiration
 
 http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/MapMaker.html
 
 http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
  SolrCache, LRUCache etc can be implemented with MapMaker or CacheBuilder
 
  --
  This message is automatically generated by JIRA.
  If you think it was sent incorrectly, please contact your JIRA
 administrators
  For more information on JIRA, see:
 http://www.atlassian.com/software/jira
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[jira] [Updated] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5153:
---

Attachment: LUCENE-5153.patch

Added a test to TestingAnalyzers, which is under lucene/analysis/common. Is 
there a suitable test under lucene/core?

Also, now that someone can override either the components or the reader, maybe 
wrapComponents should also not be abstract, and just return the passed 
components? Or should we make both of them abstract?

Another question (unrelated to this issue): why do we need getWrappedAnalyzer 
instead of taking the wrapped analyzer in the ctor, like all of our Filter 
classes do?

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch, LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724230#comment-13724230
 ] 

Robert Muir commented on LUCENE-5153:
-

I don't see the test in the patch... but I think it should be under lucene/core 
(and just wrap MockAnalyzer with MockCharFilter or something).

I think it's good to make wrapComponents just return the components as a 
default. This will make PerFieldAnalyzerWrapper look less stupid :)

getWrappedAnalyzer is explained by its javadocs: you might want a different 
analyzer for different fields.



[jira] [Resolved] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.

2013-07-30 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich resolved SOLR-5088.
---

Resolution: Not A Problem

 ClassCastException is thrown when trying to use custom SearchHandler.
 -

 Key: SOLR-5088
 URL: https://issues.apache.org/jira/browse/SOLR-5088
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Pavel Yaskevich

 Hi guys,
   I'm trying to replace solr.SearchHandler with a custom one in solrconfig.xml 
 for one of the stores, and it's throwing the following exception: 
 {noformat}
 Caused by: org.apache.solr.common.SolrException: RequestHandler init failure
   at 
 org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:167)
   at org.apache.solr.core.SolrCore.init(SolrCore.java:772)
   ... 13 more
 Caused by: org.apache.solr.common.SolrException: Error Instantiating Request 
 Handler, org.my.solr.index.CustomSearchHandler failed to instantiate 
 org.apache.solr.request.SolrRequestHandler
   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551)
   at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:603)
   at 
 org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:153)
   ... 14 more
 Caused by: java.lang.ClassCastException: class 
 org.my.solr.index.CustomSearchHandler
   at java.lang.Class.asSubclass(Class.java:3116)
   at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
   at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381)
   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:530)
   ... 16 more
 {noformat}
 I actually tried extending SearchHandler, implementing SolrRequestHandler 
 directly, and extending RequestHandlerBase, and it's all the same 
 ClassCastException result... 
 org.my.solr.index.CustomSearchHandler is definitely on the classpath and 
 recompiled on every retry. 
 Maybe I'm doing something terribly wrong?



[jira] [Commented] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.

2013-07-30 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724241#comment-13724241
 ] 

Pavel Yaskevich commented on SOLR-5088:
---

Thanks for the tip [~mkhludnev]; putting the handler into the war file did 
help, so I'm resolving the ticket. Confirmed that I was doing it wrong :)
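For anyone hitting the same wall: the ClassCastException comes from Class.asSubclass, which SolrResourceLoader uses to check that the loaded class is a SolrRequestHandler. Here is a minimal sketch of that check; Handler stands in for SolrRequestHandler, and the two-classloader twist is only described in comments, since reproducing it would need a second classloader.

```java
// Sketch of the subtype check behind SolrResourceLoader.findClass.
public class AsSubclassDemo {
    interface Handler {}                           // stand-in for SolrRequestHandler
    static class MyHandler implements Handler {}   // a genuine implementation
    static class Unrelated {}                      // not a Handler

    public static void main(String[] args) {
        // True subtype: asSubclass succeeds and returns the same Class object.
        Class<? extends Handler> ok = MyHandler.class.asSubclass(Handler.class);
        System.out.println("ok: " + ok.getSimpleName());   // ok: MyHandler

        // Not a subtype: asSubclass throws the ClassCastException Solr reports.
        try {
            Unrelated.class.asSubclass(Handler.class);
        } catch (ClassCastException e) {
            System.out.println("caught ClassCastException");
        }

        // The deployment trap: if MyHandler is loaded by a different classloader
        // than the one that loaded Handler, then "Handler" as seen through
        // MyHandler's loader is a *different* Class object, isAssignableFrom
        // returns false, and asSubclass throws even though the source clearly
        // implements the interface. Packaging the handler so one loader sees
        // both classes (e.g. inside the war) avoids the mismatch.
    }
}
```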



[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-07-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724253#comment-13724253
 ] 

Michael McCandless commented on LUCENE-3069:
-

Wow, those are nice perf results, without implementing intersect!

Intersect really is an optional operation, so we could stop here/now and button 
everything up :)

I like this approach: you moved all the metadata (docFreq, totalTermFreq, and 
the long[] and byte[] from the PostingsBaseFormat) into blocks, and then when 
we really need a term's metadata we go to its block and scan for it (like 
block tree).

I wonder if we could use MonotonicAppendingLongBuffer instead of long[] for the 
in-memory skip data?  Right now it's, I think, 48 bytes per block (block = 128 
terms), so I guess that's fairly small (0.375 bytes per term).
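As a sanity check on the per-term figure above (the 48-byte block size is the estimate from the comment, not something verified against the patch):

```java
// Per-term overhead of the in-memory skip data, using the numbers above.
public class SkipOverheadCheck {
    public static void main(String[] args) {
        int bytesPerBlock = 48;    // estimated skip data per block (from the comment)
        int termsPerBlock = 128;   // terms per block
        double perTerm = (double) bytesPerBlock / termsPerBlock;
        System.out.println(perTerm + " bytes/term");   // 0.375 bytes/term
    }
}
```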

{quote}
It is a little similar to BTTR now, and we can someday control how much
data to keep memory resident (e.g. keep stats in memory but metadata on 
disk, however this should be another issue).
{quote}
That's a nice (future) plus; this way the app can keep only the terms+ords in 
RAM, and leave all term metadata on disk.  But this is definitely optional for 
the project and we should separately explore it ...

{quote}
Another good part is that it naturally supports seek by ord. (Ah, 
actually I don't understand where it is used.)
{quote}

This is also a nice side-effect!

 Lucene should have an entirely memory resident term dictionary
 --

 Key: LUCENE-3069
 URL: https://issues.apache.org/jira/browse/LUCENE-3069
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/search
Affects Versions: 4.0-ALPHA
Reporter: Simon Willnauer
Assignee: Han Jiang
  Labels: gsoc2013
 Fix For: 5.0, 4.5

 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch


 The FST-based term dictionary has been a great improvement, yet it still uses a 
 delta-codec file for scanning to terms. Some environments have enough memory 
 available to keep the entire FST-based term dict in memory. We should add a 
 TermDictionary implementation that encodes all needed information for each 
 term into the FST (custom fst.Output) and builds an FST from the entire term, 
 not just the delta.



[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724266#comment-13724266
 ] 

Shai Erera commented on LUCENE-5153:
-

bq. I dont see the test in the patch

Hmm, I was sure I created a new patch. Will upload one soon, after I move the 
test under lucene/core.

bq. I think its good to make wrapComponents just return the components as a 
default.

Ok, will do.

bq. the getWrappedAnalyzer is explained by its javadocs

Duh, I should have read them before. :)



[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 330 - Still Failing

2013-07-30 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/330/

1 tests failed.
FAILED:  org.apache.lucene.util.packed.TestPackedInts.testPackedIntsNull

Error Message:


Stack Trace:
java.lang.AssertionError
at 
__randomizedtesting.SeedInfo.seed([1F00AC77CDB3B5F8:721ABE202E8339ED]:0)
at 
org.apache.lucene.util.packed.PackedInts$NullReader.get(PackedInts.java:709)
at 
org.apache.lucene.util.packed.TestPackedInts.testPackedIntsNull(TestPackedInts.java:556)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:679)




Build Log:
[...truncated 794 lines...]
   [junit4] Suite: org.apache.lucene.util.packed.TestPackedInts
   [junit4]   2 NOTE: download the large Jenkins line-docs file by running 
'ant get-jenkins-line-docs' in the lucene directory.
   [junit4]   2 NOTE: reproduce with: ant test  -Dtestcase=TestPackedInts 
-Dtests.method=testPackedIntsNull -Dtests.seed=1F00AC77CDB3B5F8 
-Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/hudson/lucene-data/enwiki.random.lines.txt 
-Dtests.locale=en_SG -Dtests.timezone=SystemV/EST5 -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 0.10s J1 | 

[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-07-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724280#comment-13724280
 ] 

Simon Willnauer commented on LUCENE-5152:
-

bq. So it's really just a BytesRef bug, right? 
Well, in theory that is true. Yet, if you have an arc in your hand, you can 
basically change it by passing it to a subsequent call to readNextTargetArc (or 
whatever), which would override its values completely. BytesRef is tricky, but 
not the root cause of this issue. I do think that if you call:

{noformat} 
public Arc<T> findTargetArc(int labelToMatch, Arc<T> follow, Arc<T> arc, 
BytesReader in) throws IOException
{noformat}

it should always fill the arc that is provided, so everything you do with it is 
up to you. Aside from this, I agree BytesRef is tricky and we should fix it if 
possible.
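The trap described here (an API that fills and returns the caller-supplied mutable object) can be shown with a tiny analog. Arc and readNextArc below are illustrative stand-ins, not the real FST API:

```java
// Illustrative analog of the reused-arc trap: the reader fills the mutable
// object the caller passed in, so any reference held across calls gets
// silently clobbered. Arc and readNextArc are stand-ins, not the Lucene API.
public class ArcReuseDemo {
    static class Arc { int label; }

    // Fills and returns the *same* object it was given, like findTargetArc.
    static Arc readNextArc(Arc reuse, int label) {
        reuse.label = label;
        return reuse;
    }

    public static void main(String[] args) {
        Arc arc = new Arc();
        Arc first = readNextArc(arc, 1);
        Arc second = readNextArc(arc, 2);   // overwrites the shared object

        System.out.println(first == second);   // true: same instance
        System.out.println(first.label);       // 2, not 1: the old value is gone
        // A caller that must keep an arc across calls needs a private copy,
        // which is exactly the contract question raised above.
    }
}
```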

 Lucene FST is not immutable
 --

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch


 A spinoff from LUCENE-5120, where the analyzing suggester modified an output 
 returned from an FST (a BytesRef), which caused side effects in later 
 execution. 
 I added an assertion into the FST that checks whether a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we will find more places. We need to think about how to make this 
 less trappy, since it can cause bugs that are super hard to find.



[jira] [Updated] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5153:
---

Attachment: LUCENE-5153.patch

Patch with the discussed fixes and test.



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b99) - Build # 6723 - Failure!

2013-07-30 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6723/
Java: 32bit/jdk1.8.0-ea-b99 -client -XX:+UseConcMarkSweepGC

2 tests failed.
REGRESSION:  org.apache.solr.core.TestJmxIntegration.testJmxUpdate

Error Message:
No mbean found for SolrIndexSearcher

Stack Trace:
java.lang.AssertionError: No mbean found for SolrIndexSearcher
at 
__randomizedtesting.SeedInfo.seed([F18FE6A79C36BC90:E7E8D4CD0CE0173B]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertFalse(Assert.java:68)
at 
org.apache.solr.core.TestJmxIntegration.testJmxUpdate(TestJmxIntegration.java:120)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:491)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:724)


REGRESSION:  org.apache.solr.core.TestJmxIntegration.testJmxRegistration

Error 

[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 55775 - Failure!

2013-07-30 Thread builder
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/55775/

1 tests failed.
REGRESSION:  org.apache.lucene.util.packed.TestPackedInts.testPackedIntsNull

Error Message:


Stack Trace:
java.lang.AssertionError
at 
__randomizedtesting.SeedInfo.seed([E9BA044CFBEFB54B:84A0161B18DF395E]:0)
at 
org.apache.lucene.util.packed.PackedInts$NullReader.get(PackedInts.java:709)
at 
org.apache.lucene.util.packed.TestPackedInts.testPackedIntsNull(TestPackedInts.java:556)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:722)




Build Log:
[...truncated 1325 lines...]
BUILD FAILED
/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/build.xml:49:
 The following error occurred while executing this line:
/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/common-build.xml:1230:
 The following error occurred while executing this line:
/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/common-build.xml:873:
 There were test failures: 363 suites, 2313 tests, 1 failure, 59 ignored (46 
assumptions)

Total time: 4 minutes 15 seconds
Build step 'Invoke Ant' marked 

Re: [JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 330 - Still Failing

2013-07-30 Thread Robert Muir
I committed a fix.

On Tue, Jul 30, 2013 at 3:01 PM, Apache Jenkins Server 
jenk...@builds.apache.org wrote:

 Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/330/

 1 tests failed.
 FAILED:  org.apache.lucene.util.packed.TestPackedInts.testPackedIntsNull


[jira] [Updated] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5153:


Attachment: LUCENE-5153.patch

just a tiny improvement to the test (uses basetokenstream assert)

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch, LUCENE-5153.patch, LUCENE-5153.patch, 
 LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5092) Send shard request to multiple replicas

2013-07-30 Thread Isaac Hebsh (JIRA)
Isaac Hebsh created SOLR-5092:
-

 Summary: Send shard request to multiple replicas
 Key: SOLR-5092
 URL: https://issues.apache.org/jira/browse/SOLR-5092
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, SolrCloud
Affects Versions: 4.4
Reporter: Isaac Hebsh
Priority: Minor


We have a case on a SolrCloud cluster: queries take too much QTime due to a 
randomly slow shard request. In a noticeable fraction of queries, the slowest 
shard consumes more than 4 times the average QTime.

Of course, a deep inspection of the performance factors should be made on the 
specific environment.

But there is one more idea:

If the shard request is sent to all of the replicas of each shard, the 
probability that all replicas of the same shard are the slowest is very 
small. Obviously the cluster works harder, but at a (very) low QPS it might be OK.
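As a hedged back-of-envelope check of the claim above (not part of the issue itself): if each replica is independently "slow" with probability p, the chance that all k replicas of one shard are slow at the same time is p^k.

```java
// Illustrative only: quantifies why querying several replicas of a shard
// makes an all-replicas-slow outcome unlikely. Class and method names are
// made up for this sketch, not taken from Solr.
class SlowReplicaOdds {
    // Probability that ALL k independent replicas are slow, given each is
    // slow with probability p.
    static double allSlow(double p, int k) {
        return Math.pow(p, k);
    }

    public static void main(String[] args) {
        // With p = 0.5 and 3 replicas, the all-slow probability is 0.125.
        System.out.println(allSlow(0.5, 3));
    }
}
```

The independence assumption is optimistic (replicas on the same overloaded node are correlated), but it captures why the trade-off can pay off at low QPS.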




[jira] [Updated] (SOLR-5092) Send shard request to multiple replicas

2013-07-30 Thread Isaac Hebsh (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isaac Hebsh updated SOLR-5092:
--

Attachment: SOLR-5092.patch

Submitting initial patch.
As Erick suggested on the mailing list, the changes are only in solrj, so 
nothing should change in core.

This change should be very easy: just move the single HTTP request into a 
CompletionService :)

But the most complicated thing in this patch is preserving the original 
exception handling. Some exceptions are considered temporary, while others 
are fatal. Moreover, we want to preserve the zombie list maintenance as is.
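The CompletionService idea described above can be sketched as follows. This is a minimal illustration, not the actual SOLR-5092 patch: the class and method names are invented, and real shard requests would go through SolrJ's HTTP client rather than plain Callables.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Submit the same shard request to every replica and keep the first
// successful response; a failed replica counts as a "temporary" error and
// is simply skipped, mirroring the exception-handling concern above.
class FirstReplicaWins {
    static String queryReplicas(List<Callable<String>> replicaCalls) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        for (Callable<String> call : replicaCalls) {
            cs.submit(call);
        }
        try {
            for (int i = 0; i < replicaCalls.size(); i++) {
                try {
                    return cs.take().get(); // first completed successful call wins
                } catch (ExecutionException e) {
                    // temporary failure on this replica: try the next completion
                }
            }
            return null; // every replica failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> calls = new ArrayList<>();
        calls.add(() -> { throw new RuntimeException("replica1 down"); });
        calls.add(() -> "ok-from-replica2");
        System.out.println(queryReplicas(calls)); // prints ok-from-replica2
    }
}
```

A real implementation would also need to distinguish fatal exceptions (which should abort the whole request) from temporary ones, and to update the zombie list, as the patch description notes.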

 Send shard request to multiple replicas
 ---

 Key: SOLR-5092
 URL: https://issues.apache.org/jira/browse/SOLR-5092
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, SolrCloud
Affects Versions: 4.4
Reporter: Isaac Hebsh
Priority: Minor
  Labels: distributed, performance, shard, solrcloud
 Attachments: SOLR-5092.patch


 We have a case on a SolrCloud cluster. Queries takes too much QTime, due to a 
 randomly slow shard request. In a noticeable part of queries, the slowest 
 shard consumes more than 4 times qtime than the average.
 Of course, deep inspection of the performance factor should be made on the 
 specific environment.
 But, there is one more idea:
 If shard request will be sent to all of the replicas of each shard, the 
 probability of all the replicas of the same shard to be the slowest is very 
 small. Obviously cluster works harder, but on a (very) low qps, it might be 
 OK.




[jira] [Created] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread David Smiley (JIRA)
David Smiley created SOLR-5093:
--

 Summary: Rewrite field:* to use the filter cache
 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley


Sometimes people write a query including something like {{field:*}}, which 
matches all documents that have an indexed value in that field.  That can be 
particularly expensive for tokenized text, numeric, and spatial fields.  The 
expert advice is to index a separate boolean field that is used in place of 
these query clauses, but that's annoying to do, and it can take users a while to 
realize that's what they need to do.

I propose that Solr's query parser rewrite such queries to return a query 
backed by Solr's filter cache.  The underlying query happens once (and it's 
slow this time) and is then cached, after which it's super-fast to reuse.  
Unfortunately Solr's filter cache is currently index-global, not per-segment; 
that's being handled in a separate issue.  

Related to this, it may be worth considering whether Solr should, behind the 
scenes, index a field that records which fields have indexed values; it could 
then use this indexed data to power these queries so they are always fast to 
execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could similarly 
use this.

For an example of how a user bumped into this, see:
http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html
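To make the "separate boolean field" workaround concrete, here is a hedged sketch of the two request shapes (the field names {{price}} and {{has_price}} are hypothetical, chosen only for illustration):

```text
# Existence check via wildcard: expanded against the field's indexed
# values, which can be very slow for text/numeric/spatial fields
q=text:foo&fq=price:*

# Workaround: at index time also populate a boolean has_price field,
# then filter on that instead
q=text:foo&fq=has_price:true
```

The proposal in this issue aims to give {{field:*}} itself the performance of the second form, without requiring users to maintain the extra field.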




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724391#comment-13724391
 ] 

ASF subversion and git services commented on LUCENE-5153:
-

Commit 1508622 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1508622 ]

LUCENE-5153:  Allow wrapping Reader from AnalyzerWrapper

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch, LUCENE-5153.patch, LUCENE-5153.patch, 
 LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724393#comment-13724393
 ] 

Robert Muir commented on SOLR-5093:
---

Err, this user already had this in their fq. So if they had a filter cache, 
they'd be using it.

They should pull that slow piece out into a separate fq so it's cached by 
itself. I don't understand why the query parser needs to do anything else here 
(especially any trappy auto-caching).

 Rewrite field:* to use the filter cache
 ---

 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley

 Sometimes people writes a query including something like {{field:*}} which 
 matches all documents that have an indexed value in that field.  That can be 
 particularly expensive for tokenized text, numeric, and spatial fields.  The 
 expert advise is to index a separate boolean field that is used in place of 
 these query clauses, but that's annoying to do and it can take users a while 
 to realize that's what they need to do.
 I propose that Solr's query parser rewrite such queries to return a query 
 backed by Solr's filter cache.  The underlying query happens once (and it's 
 slow this time) and then it's cached after which it's super-fast to reuse.  
 Unfortunately Solr's filter cache is currently index global, not per-segment; 
 that's being handled in a separate issue.  
 Related to this, it may be worth considering if Solr should behind the scenes 
 index a field that records which fields have indexed values, and then it 
 could use this indexed data to power these queries so they are always fast to 
 execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could similarly 
 use this.
 For an example of how a user bumped into this, see:
 http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724397#comment-13724397
 ] 

ASF subversion and git services commented on LUCENE-5153:
-

Commit 1508623 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508623 ]

LUCENE-5153:  Allow wrapping Reader from AnalyzerWrapper

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5153.patch, LUCENE-5153.patch, LUCENE-5153.patch, 
 LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Commented] (SOLR-5092) Send shard request to multiple replicas

2013-07-30 Thread Isaac Hebsh (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724396#comment-13724396
 ] 

Isaac Hebsh commented on SOLR-5092:
---

Question:
In HttpShardHandler, I can find this comment:
{code}
// maps "localhost:8983|localhost:7574" to a shuffled
// List("http://localhost:8983", "http://localhost:7574")
// This is primarily to keep track of what order we should use to query the replicas of a shard
// so that we use the same replica for all phases of a distributed request.
shardToURLs = new HashMap<String, List<String>>();
{code}

Why is replica-consistency so important? What would happen if one phase of a 
distributed request got a response from replica1 and another phase got a 
response from replica2?
I think this situation can already happen in the current state, if one replica 
stops responding during the distributed process.
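The mechanism the quoted comment describes can be sketched like this (method and class names here are illustrative, not Solr's actual code): the shard string is split into hosts, shuffled once, and the resulting order is cached so later phases of the same request reuse it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Build the replica URL list for a shard once per request, shuffled for
// load balancing, then reuse the cached order for every later phase.
class ShardUrlOrder {
    static List<String> urlsFor(String shard,
                                Map<String, List<String>> shardToURLs,
                                Random rnd) {
        return shardToURLs.computeIfAbsent(shard, s -> {
            List<String> urls = new ArrayList<>();
            for (String host : s.split("\\|")) {
                urls.add("http://" + host);
            }
            Collections.shuffle(urls, rnd); // randomize once...
            return urls;                    // ...later phases see the same order
        });
    }

    public static void main(String[] args) {
        Map<String, List<String>> cache = new HashMap<>();
        Random rnd = new Random(42);
        List<String> phase1 = urlsFor("localhost:8983|localhost:7574", cache, rnd);
        List<String> phase2 = urlsFor("localhost:8983|localhost:7574", cache, rnd);
        System.out.println(phase1.equals(phase2)); // prints true
    }
}
```

Keeping the order stable means all phases of one distributed request prefer the same replica, which is what the question above is probing: whether that preference is a correctness requirement or just a heuristic.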

 Send shard request to multiple replicas
 ---

 Key: SOLR-5092
 URL: https://issues.apache.org/jira/browse/SOLR-5092
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, SolrCloud
Affects Versions: 4.4
Reporter: Isaac Hebsh
Priority: Minor
  Labels: distributed, performance, shard, solrcloud
 Attachments: SOLR-5092.patch


 We have a case on a SolrCloud cluster. Queries takes too much QTime, due to a 
 randomly slow shard request. In a noticeable part of queries, the slowest 
 shard consumes more than 4 times qtime than the average.
 Of course, deep inspection of the performance factor should be made on the 
 specific environment.
 But, there is one more idea:
 If shard request will be sent to all of the replicas of each shard, the 
 probability of all the replicas of the same shard to be the slowest is very 
 small. Obviously cluster works harder, but on a (very) low qps, it might be 
 OK.




[jira] [Resolved] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5153.


   Resolution: Fixed
Fix Version/s: 4.5
   5.0

Thanks Rob. I applied your improvement and committed.

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5153.patch, LUCENE-5153.patch, LUCENE-5153.patch, 
 LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.




[jira] [Comment Edited] (SOLR-5092) Send shard request to multiple replicas

2013-07-30 Thread Isaac Hebsh (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724376#comment-13724376
 ] 

Isaac Hebsh edited comment on SOLR-5092 at 7/30/13 8:51 PM:


Submitting initial patch.
As [~erickoerickson] suggested on the mailing list, the changes are only in 
solrj, so nothing should change in core.

This change should be very easy: just move the single HTTP request into a 
CompletionService :)

But the most complicated thing in this patch is preserving the original 
exception handling. Some exceptions are considered temporary, while others 
are fatal. Moreover, we want to preserve the zombie list maintenance as is.

  was (Author: isaachebsh):
Submitting initial patch.
As Erick suggested on mailing list, changes are only in solrj, so nothing 
should be changed in core.

This change should be very easy. Just move the single HTTP request into 
CompletionService :)

But, the most complicated thing in this patch, is to preserve the original 
exception handling. There are some exceptions which are considered as 
temporary, while other exceptions are fatal. Moreover, we want to preserve the 
zombie list maintenance as is.
  
 Send shard request to multiple replicas
 ---

 Key: SOLR-5092
 URL: https://issues.apache.org/jira/browse/SOLR-5092
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, SolrCloud
Affects Versions: 4.4
Reporter: Isaac Hebsh
Priority: Minor
  Labels: distributed, performance, shard, solrcloud
 Attachments: SOLR-5092.patch


 We have a case on a SolrCloud cluster. Queries takes too much QTime, due to a 
 randomly slow shard request. In a noticeable part of queries, the slowest 
 shard consumes more than 4 times qtime than the average.
 Of course, deep inspection of the performance factor should be made on the 
 specific environment.
 But, there is one more idea:
 If shard request will be sent to all of the replicas of each shard, the 
 probability of all the replicas of the same shard to be the slowest is very 
 small. Obviously cluster works harder, but on a (very) low qps, it might be 
 OK.




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724405#comment-13724405
 ] 

Jack Krupansky commented on SOLR-5093:
--

Some time ago I suggested a related approach: LUCENE-4386 - Query parser 
should generate FieldValueFilter for pure wildcard terms to boost query 
performance.

There were objections from the Lucene guys, but now that the Solr query parser 
is divorced from Lucene, maybe it could be reconsidered.

I can't speak to the relative merits of using the filter cache vs. the 
FieldValueFilter.


 Rewrite field:* to use the filter cache
 ---

 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley

 Sometimes people writes a query including something like {{field:*}} which 
 matches all documents that have an indexed value in that field.  That can be 
 particularly expensive for tokenized text, numeric, and spatial fields.  The 
 expert advise is to index a separate boolean field that is used in place of 
 these query clauses, but that's annoying to do and it can take users a while 
 to realize that's what they need to do.
 I propose that Solr's query parser rewrite such queries to return a query 
 backed by Solr's filter cache.  The underlying query happens once (and it's 
 slow this time) and then it's cached after which it's super-fast to reuse.  
 Unfortunately Solr's filter cache is currently index global, not per-segment; 
 that's being handled in a separate issue.  
 Related to this, it may be worth considering if Solr should behind the scenes 
 index a field that records which fields have indexed values, and then it 
 could use this indexed data to power these queries so they are always fast to 
 execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could similarly 
 use this.
 For an example of how a user bumped into this, see:
 http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724422#comment-13724422
 ] 

Robert Muir commented on SOLR-5093:
---

Those same Lucene guys are not afraid to object here either.

This user just has to pull the "AND pp:*" out into a separate fq of pp:*.

{quote}
(Each filter is executed and cached separately. When it's time to use them to 
limit the number of results returned by a query, this is done using set 
intersections.)
{quote}
http://wiki.apache.org/solr/SolrCaching#filterCache

 Rewrite field:* to use the filter cache
 ---

 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley

 Sometimes people writes a query including something like {{field:*}} which 
 matches all documents that have an indexed value in that field.  That can be 
 particularly expensive for tokenized text, numeric, and spatial fields.  The 
 expert advise is to index a separate boolean field that is used in place of 
 these query clauses, but that's annoying to do and it can take users a while 
 to realize that's what they need to do.
 I propose that Solr's query parser rewrite such queries to return a query 
 backed by Solr's filter cache.  The underlying query happens once (and it's 
 slow this time) and then it's cached after which it's super-fast to reuse.  
 Unfortunately Solr's filter cache is currently index global, not per-segment; 
 that's being handled in a separate issue.  
 Related to this, it may be worth considering if Solr should behind the scenes 
 index a field that records which fields have indexed values, and then it 
 could use this indexed data to power these queries so they are always fast to 
 execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could similarly 
 use this.
 For an example of how a user bumped into this, see:
 http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724428#comment-13724428
 ] 

David Smiley commented on SOLR-5093:


Rob,
You're right for this particular user's use case that I mentioned.  I 
overlooked that aspect of his query.  Nonetheless, I don't think that negates 
the usefulness of what I propose in this issue.

If you consider auto-caching trappy, then you probably don't like Solr very 
much at all.

 Rewrite field:* to use the filter cache
 ---

 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley

 Sometimes people writes a query including something like {{field:*}} which 
 matches all documents that have an indexed value in that field.  That can be 
 particularly expensive for tokenized text, numeric, and spatial fields.  The 
 expert advise is to index a separate boolean field that is used in place of 
 these query clauses, but that's annoying to do and it can take users a while 
 to realize that's what they need to do.
 I propose that Solr's query parser rewrite such queries to return a query 
 backed by Solr's filter cache.  The underlying query happens once (and it's 
 slow this time) and then it's cached after which it's super-fast to reuse.  
 Unfortunately Solr's filter cache is currently index global, not per-segment; 
 that's being handled in a separate issue.  
 Related to this, it may be worth considering if Solr should behind the scenes 
 index a field that records which fields have indexed values, and then it 
 could use this indexed data to power these queries so they are always fast to 
 execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could similarly 
 use this.
 For an example of how a user bumped into this, see:
 http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724430#comment-13724430
 ] 

Jack Krupansky commented on SOLR-5093:
--

bq. This user just has to pull out AND pp:* into another fq of pp:*

Exactly! That's what we (non-Lucene guys) are trying to do: eliminate the need 
for users to do that kind of manual optimization.

We want Solr to behave as optimally as possible OOTB.


 Rewrite field:* to use the filter cache
 ---

 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley

 Sometimes people writes a query including something like {{field:*}} which 
 matches all documents that have an indexed value in that field.  That can be 
 particularly expensive for tokenized text, numeric, and spatial fields.  The 
 expert advise is to index a separate boolean field that is used in place of 
 these query clauses, but that's annoying to do and it can take users a while 
 to realize that's what they need to do.
 I propose that Solr's query parser rewrite such queries to return a query 
 backed by Solr's filter cache.  The underlying query happens once (and it's 
 slow this time) and then it's cached after which it's super-fast to reuse.  
 Unfortunately Solr's filter cache is currently index global, not per-segment; 
 that's being handled in a separate issue.  
 Related to this, it may be worth considering if Solr should behind the scenes 
 index a field that records which fields have indexed values, and then it 
 could use this indexed data to power these queries so they are always fast to 
 execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could similarly 
 use this.
 For an example of how a user bumped into this, see:
 http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724443#comment-13724443
 ] 

Robert Muir commented on SOLR-5093:
---

Solr today doesn't auto-cache. You can specify that you intend for a query to 
act only as a filter with fqs, control the caching behavior of those fqs, and 
so on.

So there is no need to add any additional auto-caching in the query parser. 
Things like LUCENE-4386 would just cause filter-cache insanity, where the same 
thing is cached in duplicate places (in FieldCache.docsWithField as well as in 
fq bitsets).

Auto-caching things in the query can easily pollute the cache with stuff that's 
not actually intended to be reused: then it doesn't really work at all.
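For reference, the explicit per-filter control Robert refers to looks roughly like this in a Solr request (the {{cache}} local param is the existing mechanism; the field names below are made up for illustration):

```text
# Cached filter (the default): one filterCache entry, reusable across queries
fq=inStock:true

# Explicitly uncached filter: evaluated per request, never stored in the cache
fq={!cache=false}price:*
```

The point is that caching is already an opt-in/opt-out decision the request author makes, rather than something the parser infers.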

 Rewrite field:* to use the filter cache
 ---

 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley

 Sometimes people writes a query including something like {{field:*}} which 
 matches all documents that have an indexed value in that field.  That can be 
 particularly expensive for tokenized text, numeric, and spatial fields.  The 
 expert advise is to index a separate boolean field that is used in place of 
 these query clauses, but that's annoying to do and it can take users a while 
 to realize that's what they need to do.
 I propose that Solr's query parser rewrite such queries to return a query 
 backed by Solr's filter cache.  The underlying query happens once (and it's 
 slow this time) and then it's cached after which it's super-fast to reuse.  
 Unfortunately Solr's filter cache is currently index global, not per-segment; 
 that's being handled in a separate issue.  
 Related to this, it may be worth considering if Solr should behind the scenes 
 index a field that records which fields have indexed values, and then it 
 could use this indexed data to power these queries so they are always fast to 
 execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could similarly 
 use this.
 For an example of how a user bumped into this, see:
 http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724446#comment-13724446
 ] 

Hoss Man commented on SOLR-5093:


I can see the argument for making field:* parse as equivalent to field:[* TO 
*] if the latter is in fact more efficient, but I agree with Rob that we 
shouldn't make the parser pull out individual clauses and construct special 
query objects that are backed by the filterCache.  If I have an fq in my 
solrconfig that looks like this...

{noformat}
<str name="fq">X AND Y AND Z</str>
{noformat}

...that entire BooleanQuery should be cached as a single entity in the 
filterCache regardless of what X, Y, and Z really are -- because that's what I 
asked for: a single filter query.

It would suck if the query parser looked at the specifics of each of those 
clauses and said "I'm going to try and be smart and make each of these clauses 
a special query backed by the filterCache", because then I'd have 4 queries in 
my filterCache instead of just 1, and 3 of them would never be used.
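In other words, the caching granularity follows how the request is written, not what the parser discovers inside it (X, Y, and Z are placeholder clauses here):

```text
# One filterCache entry: the whole boolean query is cached as a single filter
fq=X AND Y AND Z

# Three filterCache entries: each clause is cached and reusable independently
fq=X&fq=Y&fq=Z
```

Users who want per-clause caching can already ask for it with separate fq parameters; having the parser second-guess that choice would multiply cache entries without the user's intent.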



 Rewrite field:* to use the filter cache
 ---

 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley

 Sometimes people writes a query including something like {{field:*}} which 
 matches all documents that have an indexed value in that field.  That can be 
 particularly expensive for tokenized text, numeric, and spatial fields.  The 
 expert advise is to index a separate boolean field that is used in place of 
 these query clauses, but that's annoying to do and it can take users a while 
 to realize that's what they need to do.
 I propose that Solr's query parser rewrite such queries to return a query 
 backed by Solr's filter cache.  The underlying query happens once (and it's 
 slow this time), and then it's cached, after which it's super-fast to reuse.  
 Unfortunately Solr's filter cache is currently index-global, not per-segment; 
 that's being handled in a separate issue.  
 Related to this, it may be worth considering whether Solr should, behind the 
 scenes, index a field that records which fields have indexed values; it 
 could then use this indexed data to power these queries so they are always 
 fast to execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could 
 similarly use this.
 For an example of how a user bumped into this, see:
 http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html
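As a rough illustration of the proposed rewrite step (the names here are illustrative, not Solr's actual parser API), a bare existence clause could be mapped onto the equivalent open-ended range before query construction:

```java
// Hypothetical sketch: rewrite a bare existence clause "f:*" to the
// equivalent open-ended range "f:[* TO *]", which the parser could then
// back with the filter cache. Not Solr's real QParser API.
class ExistenceRewriter {
    public static String rewrite(String clause) {
        // Only handle the simple single-field form "field:*".
        if (clause.endsWith(":*") && clause.indexOf(':') == clause.length() - 2) {
            String field = clause.substring(0, clause.length() - 2);
            return field + ":[* TO *]";
        }
        return clause; // anything else passes through unchanged
    }
}
```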

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-07-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724450#comment-13724450
 ] 

David Smiley commented on LUCENE-3069:
--

Nice work!  The spatial prefix trees will have even more awesome performance 
with all terms in RAM.  It'd be nice if I could configure the docFreq to be 
memory resident but, as Mike said, adding options like that can be explored 
later.

 Lucene should have an entirely memory resident term dictionary
 --

 Key: LUCENE-3069
 URL: https://issues.apache.org/jira/browse/LUCENE-3069
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/search
Affects Versions: 4.0-ALPHA
Reporter: Simon Willnauer
Assignee: Han Jiang
  Labels: gsoc2013
 Fix For: 5.0, 4.5

 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch


 The FST-based TermDictionary has been a great improvement, yet it still uses 
 a delta-codec file for scanning to terms.  Some environments have enough 
 memory available to keep the entire FST-based term dict in memory.  We 
 should add a TermDictionary implementation that encodes all needed 
 information for each term into the FST (custom fst.Output) and builds an FST 
 from the entire term, not just the delta.
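Conceptually, the goal is that a term lookup returns all per-term metadata from memory without scanning the on-disk terms file. The sketch below uses a TreeMap purely as a stand-in for the FST (the real implementation encodes this metadata into custom fst.Outputs):

```java
import java.util.TreeMap;

// Conceptual stand-in only: a sorted in-memory term dictionary whose
// values carry all per-term metadata (docFreq and a postings file
// pointer), so lookups never touch a delta-coded terms file on disk.
class MemoryTermDict {
    static final class TermMeta {
        final int docFreq;      // number of documents containing the term
        final long postingsFP;  // file pointer into the postings file
        TermMeta(int docFreq, long postingsFP) {
            this.docFreq = docFreq;
            this.postingsFP = postingsFP;
        }
    }

    private final TreeMap<String, TermMeta> terms = new TreeMap<>();

    public void add(String term, int docFreq, long postingsFP) {
        terms.put(term, new TermMeta(docFreq, postingsFP));
    }

    /** Pure in-memory lookup; returns null for an absent term. */
    public TermMeta lookup(String term) {
        return terms.get(term);
    }
}
```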




[jira] [Commented] (LUCENE-5153) Allow wrapping Reader from AnalyzerWrapper

2013-07-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724461#comment-13724461
 ] 

Uwe Schindler commented on LUCENE-5153:
---

Thanks!

 Allow wrapping Reader from AnalyzerWrapper
 --

 Key: LUCENE-5153
 URL: https://issues.apache.org/jira/browse/LUCENE-5153
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5153.patch, LUCENE-5153.patch, LUCENE-5153.patch, 
 LUCENE-5153.patch


 It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given 
 to initReader, e.g. with a CharFilter.
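For illustration, a CharFilter-style transformation can be sketched with java.io.FilterReader alone. In Lucene the analogous hook would be returning such a wrapped Reader from the wrapper's reader-wrapping method, but the class below is a hypothetical stand-in, not Lucene's CharFilter API:

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical stand-in for a CharFilter: lowercases every character
// before the tokenizer ever sees it.
class LowercasingReader extends FilterReader {
    LowercasingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c == -1 ? -1 : Character.toLowerCase(c);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = off; i < off + n; i++) { // no-op when n == -1
            buf[i] = Character.toLowerCase(buf[i]);
        }
        return n;
    }
}
```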



