[jira] [Commented] (LUCENE-6482) Class loading deadlock relating to Codec initialization, default codec and SPI discovery

2015-06-08 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578288#comment-14578288
 ] 

Shikhar Bhushan commented on LUCENE-6482:
-

Thanks for fixing this [~thetaphi]! Great digging on what was going on. The fix 
and the test look good to me.

 Class loading deadlock relating to Codec initialization, default codec and SPI discovery
 

 Key: LUCENE-6482
 URL: https://issues.apache.org/jira/browse/LUCENE-6482
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/codecs
Affects Versions: 4.9.1
Reporter: Shikhar Bhushan
Assignee: Uwe Schindler
Priority: Critical
 Fix For: Trunk, 5.3, 5.2.1

 Attachments: CodecLoadingDeadlockTest.java, 
 LUCENE-6482-failingtest.patch, LUCENE-6482-failingtest.patch, 
 LUCENE-6482.patch, LUCENE-6482.patch, LUCENE-6482.patch, LUCENE-6482.patch, 
 LUCENE-6482.patch


 This issue came up for us several times with Elasticsearch 1.3.4 (Lucene 
 4.9.1), with many threads seeming deadlocked but RUNNABLE:
 {noformat}
 elasticsearch[search77-es2][generic][T#43] #160 daemon prio=5 os_prio=0 tid=0x7f79180c5800 nid=0x3d1f in Object.wait() [0x7f79d9289000]
    java.lang.Thread.State: RUNNABLE
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359)
    at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:457)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:912)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:758)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:453)
    at org.elasticsearch.common.lucene.Lucene.readSegmentInfos(Lucene.java:98)
    at org.elasticsearch.index.store.Store.readSegmentsInfo(Store.java:126)
    at org.elasticsearch.index.store.Store.access$300(Store.java:76)
    at org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:465)
    at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:456)
    at org.elasticsearch.index.store.Store.readMetadataSnapshot(Store.java:281)
    at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.listStoreMetaData(TransportNodesListShardStoreMetaData.java:186)
    at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:140)
    at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:61)
    at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:277)
    at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:268)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It didn't really make sense to see RUNNABLE threads in Object.wait(), but 
 this seems to be symptomatic of deadlocks in static initialization 
 (http://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html).
 I found LUCENE-5573 as an instance of this having come up with Lucene code 
 before.
 I'm not sure what exactly is going on, but the deadlock in this case seems to 
 involve these threads:
 {noformat}
 elasticsearch[search77-es2][clusterService#updateTask][T#1] #79 daemon prio=5 os_prio=0 tid=0x7f7b155ff800 nid=0xd49 in Object.wait() [0x7f79daed8000]
    java.lang.Thread.State: RUNNABLE
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at java.lang.Class.newInstance(Class.java:433)
    at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:67)
    - locked 0x00061fef4968 (a org.apache.lucene.util.NamedSPILoader)
    at org.apache.lucene.util.NamedSPILoader.<init>(NamedSPILoader.java:47)
    at org.apache.lucene.util.NamedSPILoader.<init>(NamedSPILoader.java:37)
    at org.apache.lucene.codecs.PostingsFormat.<clinit>(PostingsFormat.java:44)
    at
 {noformat}

[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-06-06 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-6482:

Attachment: CodecLoadingDeadlockTest.java


[jira] [Commented] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-06-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575639#comment-14575639
 ] 

Shikhar Bhushan commented on LUCENE-6482:
-

Thanks Uwe. In fact I have not had a single run where the deadlock did not 
occur; just these lines do the trick every time:

{noformat}
  public static void main(String... args) {
    final Thread t1 = new Thread(() -> Codec.getDefault());
    final Thread t2 = new Thread(() -> new SimpleTextCodec());

    t1.start();
    t2.start();
  }
{noformat}

I am using JDK8u25.
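For context on why that two-line reproducer is so reliable: the JVM takes a per-class initialization lock while running a static initializer, and any other thread that touches the class meanwhile blocks until <clinit> finishes, while still being reported as RUNNABLE. With two threads each starting the initialization of a different class whose initializers depend on each other (as with Codec and a Codec subclass via SPI discovery), each holds one lock and waits forever on the other. The blocking half of that mechanism can be sketched safely with a single hypothetical class (no Lucene classes involved, names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ClassInitBlocking {
    static final AtomicInteger initRuns = new AtomicInteger();

    // Hypothetical class with a deliberately slow static initializer,
    // standing in for a <clinit> that does SPI scanning.
    static class Slow {
        static {
            initRuns.incrementAndGet();
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        // Deliberately non-final: reading it forces class initialization
        // (a static final int constant would be inlined at compile time).
        static int marker = 42;
    }

    /** Races two threads on Slow's first use; returns {initRuns, elapsedMs}. */
    static long[] race() throws InterruptedException {
        Runnable touch = () -> {
            int unused = Slow.marker; // first touch triggers <clinit>
        };
        Thread t1 = new Thread(touch);
        Thread t2 = new Thread(touch);
        long start = System.nanoTime();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // The initializer ran exactly once; the losing thread simply
        // waited on the class-init lock until it completed.
        return new long[] { initRuns.get(), elapsedMs };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] r = race();
        System.out.println("initRuns=" + r[0] + " elapsedMs=" + r[1]);
    }
}
```

A thread blocked this way shows up in a thread dump much like the traces quoted in this issue: RUNNABLE, yet not making progress.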


[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-06-06 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-6482:

Attachment: (was: CodecLoadingDeadlockTest.java)


[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-06-05 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-6482:

Attachment: CodecLoadingDeadlockTest.java


[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-06-05 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-6482:

Attachment: (was: CodecLoadingDeadlockTest.java)


[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-06-05 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-6482:

Attachment: CodecLoadingDeadlockTest.java

[~thetaphi] I can reproduce the problem quite consistently with the attached 
test. If you uncomment the first line in main() so that Codec is initialized 
before the threads start, the deadlock doesn't happen.
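That observation matches the standard workaround for static-initialization races: complete the sensitive class's <clinit> on a single thread before any concurrency begins. A hypothetical sketch of the pattern (class names are illustrative, not Lucene's):

```java
public class EagerInit {
    // Hypothetical stand-in for Codec: a class whose static
    // initializer does the registry/SPI discovery work.
    static class Registry {
        static final String DEFAULT;
        static {
            DEFAULT = "default"; // stands in for SPI discovery
        }
        static String getDefault() { return DEFAULT; }
    }

    public static void main(String[] args) throws Exception {
        // Force Registry.<clinit> to complete on the main thread before
        // any worker threads can race on it. Class.forName initializes
        // the class; merely mentioning Registry.class would not.
        Class.forName(Registry.class.getName());

        Thread t1 = new Thread(() -> System.out.println(Registry.getDefault()));
        Thread t2 = new Thread(() -> System.out.println(Registry.getDefault()));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

Once initialization has finished up front, the worker threads only take an already-released class-init lock, so no cycle between initializers can form.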


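The "deadlocked but RUNNABLE" pattern described above can be reproduced with a minimal sketch. The classes below are hypothetical (not Lucene code): each thread triggers the static initializer of one class, and each initializer then touches the other class, which is mid-initialization on the other thread. The JVM blocks each thread on the other class's initialization lock but still reports the threads as RUNNABLE, matching the stacks in these dumps.

```java
public class StaticInitDeadlockDemo {

    static class Left {
        static {
            sleep(200);     // give the other thread time to enter Right.<clinit>
            Right.touch();  // blocks: Right is being initialized by the other thread
        }
        static void touch() {}
    }

    static class Right {
        static {
            sleep(200);
            Left.touch();   // blocks: Left is being initialized by the other thread
        }
        static void touch() {}
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) {}
    }

    /** Starts both initializations concurrently; returns whether each thread is still stuck. */
    public static boolean[] runDemo(long waitMillis) throws InterruptedException {
        Thread t1 = new Thread(() -> Left.touch(), "init-left");
        Thread t2 = new Thread(() -> Right.touch(), "init-right");
        t1.setDaemon(true);  // daemon threads, so the JVM can still exit despite the deadlock
        t2.setDaemon(true);
        t1.start();
        t2.start();
        t1.join(waitMillis);
        t2.join(waitMillis);
        return new boolean[] { t1.isAlive(), t2.isAlive() };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] stuck = runDemo(1000);
        System.out.println("left stuck=" + stuck[0] + ", right stuck=" + stuck[1]);
    }
}
```

Note that because the threads are parked on class initialization locks rather than monitors or java.util.concurrent synchronizers, the JVM's built-in deadlock detection (e.g. the report at the bottom of a jstack dump) typically does not flag them.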
[jira] [Reopened] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-05-19 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan reopened LUCENE-6482:
-

Reopening as per discussion in 
https://github.com/elastic/elasticsearch/issues/11170


[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-05-14 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-6482:

[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-05-14 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-6482:

Description: 
This issue came up for us several times with Elasticsearch 1.3.4 (Lucene 
4.9.1), with many threads seeming deadlocked but RUNNABLE:
{noformat}
elasticsearch[blabla-es0][clusterService#updateTask][T#1] #79 daemon prio=5 
os_prio=0 tid=0x7fd16988d000 nid=0x6e01 waiting on condition 
[0x7fd0bc279000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0x000614a22508 (a 
org.elasticsearch.common.util.concurrent.BaseFuture$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at 
org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:274)
at 
org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:113)
at 
org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:45)
at 
org.elasticsearch.gateway.local.LocalGatewayAllocator.buildShardStores(LocalGatewayAllocator.java:443)
at 
org.elasticsearch.gateway.local.LocalGatewayAllocator.allocateUnassigned(LocalGatewayAllocator.java:281)
at 
org.elasticsearch.cluster.routing.allocation.allocator.ShardsAllocators.allocateUnassigned(ShardsAllocators.java:74)
at 
org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:217)
at 
org.elasticsearch.cluster.routing.allocation.AllocationService.applyStartedShards(AllocationService.java:86)
at 
org.elasticsearch.cluster.action.shard.ShardStateAction$4.execute(ShardStateAction.java:278)
at 
org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:328)
at 
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

It didn't really make sense to see RUNNABLE threads in Object.wait(), but this 
seems to be symptomatic of deadlocks in static initialization 
(http://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html).

I found LUCENE-5573 as an instance of this having come up with Lucene code 
before.

I'm not sure what exactly is going on, but the deadlock in this case seems to 
involve these threads:

{noformat}
elasticsearch[search77-es2][clusterService#updateTask][T#1] #79 daemon prio=5 
os_prio=0 tid=0x7f7b155ff800 nid=0xd49 in Object.wait() [0x7f79daed8000]
   java.lang.Thread.State: RUNNABLE
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at java.lang.Class.newInstance(Class.java:433)
at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:67)
- locked 0x00061fef4968 (a org.apache.lucene.util.NamedSPILoader)
at org.apache.lucene.util.NamedSPILoader.init(NamedSPILoader.java:47)
at org.apache.lucene.util.NamedSPILoader.init(NamedSPILoader.java:37)
at 
org.apache.lucene.codecs.PostingsFormat.clinit(PostingsFormat.java:44)
at 
org.elasticsearch.index.codec.postingsformat.PostingFormats.clinit(PostingFormats.java:67)
at 
org.elasticsearch.index.codec.CodecModule.configurePostingsFormats(CodecModule.java:126)
at 
org.elasticsearch.index.codec.CodecModule.configure(CodecModule.java:178)
at 
org.elasticsearch.common.inject.AbstractModule.configure(AbstractModule.java:60)
- locked 0x00061fef49e8 (a 
org.elasticsearch.index.codec.CodecModule)
at 
org.elasticsearch.common.inject.spi.Elements$RecordingBinder.install(Elements.java:204)
at 
org.elasticsearch.common.inject.spi.Elements.getElements(Elements.java:85)
at 
org.elasticsearch.common.inject.InjectorShell$Builder.build(InjectorShell.java:130)
at 
org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:99)
- locked 0x00061fef4c10 (a 

[jira] [Created] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-05-14 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created LUCENE-6482:
---

 Summary: Class loading deadlock relating to NamedSPILoader
 Key: LUCENE-6482
 URL: https://issues.apache.org/jira/browse/LUCENE-6482
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Shikhar Bhushan



[jira] [Commented] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-05-14 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544349#comment-14544349
 ] 

Shikhar Bhushan commented on LUCENE-6482:
-

[~thetaphi] This was seen on JDK8u5, but I think this has also happened on 
JDK8u25 (not certain though...). 

The issue is not deterministic and comes up during cluster bounces sometimes, 
so it's hard to say whether an upgrade fixes it.

You're probably right that this has nothing to do with NamedSPILoader but with the 
classes being loaded. Is it possible to conclude from the thread dump whether 
an ES or Lucene Codec / PostingsFormat is involved?


[jira] [Comment Edited] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-05-14 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544349#comment-14544349
 ] 

Shikhar Bhushan edited comment on LUCENE-6482 at 5/14/15 8:57 PM:
--

[~thetaphi] This was seen on JDK8u5, but I think this has also happened on 
JDK8u25 (not certain though...). 

The issue is not deterministic and comes up during cluster bounces sometimes, 
so it's hard to say whether an ES upgrade fixes it.

You're probably right that this has nothing to do with NamedSPILoader but the 
classes being loaded. Is it possible to conclude from the thread dump whether 
an ES or Lucene Codec/PostingsFormat/etc is involved?


was (Author: shikhar):
[~thetaphi] This was seen on JDK8u5, but I think this has also happened on 
JDK8u25 (not certain though...). 

The issue is not deterministic and comes up during cluster bounces sometimes, 
so it's hard to say whether an upgrade fixes it.

You're probably right that this has nothing to do with NamedSPILoader but the 
classes being loaded. Is that possible to conclude from the thread dump whether 
it is an ES or Lucene Codec / PostingsFormat is involved?


[jira] [Closed] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader

2015-05-14 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan closed LUCENE-6482.
---
Resolution: Not A Problem

Thanks [~thetaphi], makes sense and it does not seem like a Lucene issue, so 
I'll close this.

It might be that, because we use a custom Elasticsearch discovery plugin that is 
purely asynchronous, those two bits of initialization ended up happening in 
parallel and caused the deadlock.

 Class loading deadlock relating to NamedSPILoader
 -

 Key: LUCENE-6482
 URL: https://issues.apache.org/jira/browse/LUCENE-6482
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.9.1
Reporter: Shikhar Bhushan
Assignee: Uwe Schindler

 This issue came up for us several times with Elasticsearch 1.3.4 (Lucene 
 4.9.1), with many threads seeming deadlocked but RUNNABLE:
 {noformat}
 elasticsearch[search77-es2][generic][T#43] #160 daemon prio=5 os_prio=0 
 tid=0x7f79180c5800 nid=0x3d1f in Object.wait() [0x7f79d9289000]
java.lang.Thread.State: RUNNABLE
   at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359)
   at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:457)
   at 
 org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:912)
   at 
 org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:758)
   at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:453)
   at 
 org.elasticsearch.common.lucene.Lucene.readSegmentInfos(Lucene.java:98)
   at org.elasticsearch.index.store.Store.readSegmentsInfo(Store.java:126)
   at org.elasticsearch.index.store.Store.access$300(Store.java:76)
   at 
 org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:465)
   at 
 org.elasticsearch.index.store.Store$MetadataSnapshot.init(Store.java:456)
   at 
 org.elasticsearch.index.store.Store.readMetadataSnapshot(Store.java:281)
   at 
 org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.listStoreMetaData(TransportNodesListShardStoreMetaData.java:186)
   at 
 org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:140)
   at 
 org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:61)
   at 
 org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:277)
   at 
 org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:268)
   at 
 org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It didn't really make sense to see RUNNABLE threads in Object.wait(), but 
 this seems to be symptomatic of deadlocks in static initialization 
 (http://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html).
 I found LUCENE-5573 as an instance of this having come up with Lucene code 
 before.
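That blog post's scenario can be reproduced in miniature. The sketch below is hypothetical (not Lucene or Elasticsearch code): thread t2 blocks on class A's initialization lock while A's static initializer is still running on another thread, and its reported state is typically RUNNABLE despite being stuck; a join timeout keeps the demo from hanging forever:

```java
// Minimal static-initialization deadlock demo. A's <clinit> spawns a thread
// that touches B, whose <clinit> needs A; that thread blocks on A's class-init
// lock. JVMs report such a thread as RUNNABLE even though it is waiting.
public class ClinitDeadlockDemo {
    static volatile Thread.State stateWhileBlocked;

    static class A {
        static {
            Thread t2 = new Thread(() -> B.touch());
            t2.start();
            try {
                Thread.sleep(500);                  // let t2 block on A's init lock
                stateWhileBlocked = t2.getState();  // typically RUNNABLE, not WAITING
                t2.join(1000);                      // would hang forever without a timeout
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        static void touch() {}
    }

    static class B {
        static { A.touch(); }  // forces A's <clinit> to complete first
        static void touch() {}
    }

    public static Thread.State run() {
        A.touch();  // triggers A's <clinit> on the calling thread
        return stateWhileBlocked;
    }

    public static void main(String[] args) {
        System.out.println("state of thread blocked in <clinit>: " + run());
    }
}
```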
 I'm not sure what exactly is going on, but the deadlock in this case seems to 
 involve these threads:
 {noformat}
 elasticsearch[search77-es2][clusterService#updateTask][T#1] #79 daemon 
 prio=5 os_prio=0 tid=0x7f7b155ff800 nid=0xd49 in Object.wait() 
 [0x7f79daed8000]
java.lang.Thread.State: RUNNABLE
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
   at java.lang.Class.newInstance(Class.java:433)
   at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:67)
   - locked 0x00061fef4968 (a org.apache.lucene.util.NamedSPILoader)
   at org.apache.lucene.util.NamedSPILoader.&lt;init&gt;(NamedSPILoader.java:47)
   at org.apache.lucene.util.NamedSPILoader.&lt;init&gt;(NamedSPILoader.java:37)
   at 
 org.apache.lucene.codecs.PostingsFormat.&lt;clinit&gt;(PostingsFormat.java:44)
   at 
 org.elasticsearch.index.codec.postingsformat.PostingFormats.&lt;clinit&gt;(PostingFormats.java:67)
   at 
 org.elasticsearch.index.codec.CodecModule.configurePostingsFormats(CodecModule.java:126)
   at 
 org.elasticsearch.index.codec.CodecModule.configure(CodecModule.java:178)
   at 
 

[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution

2015-02-27 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340544#comment-14340544
 ] 

Shikhar Bhushan commented on LUCENE-6294:
-

This is great. I saw some improvements when testing LUCENE-5299 with the 
addition of a configurable parallelism throttle at the search request level 
using a semaphore, that might be useful to have here too. I.e. being able to 
cap how many segments are concurrently searched. That can help ensure resources 
for concurrent search requests, or reduce context switching if using an 
unbounded pool.
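A minimal sketch of that throttle, with hypothetical names (SliceThrottleSketch, PERMITS are illustrative, not Lucene API): a per-request Semaphore caps how many slice tasks run at once, even on an unbounded pool:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// A per-request Semaphore bounds concurrent slice searches; peak concurrency
// is tracked here only to demonstrate that the cap holds.
public class SliceThrottleSketch {
    static final int PERMITS = 4;  // at most 4 slices searched concurrently

    public static int searchWithThrottle(int slices) throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();  // "unbounded" pool
        Semaphore permits = new Semaphore(PERMITS);
        CountDownLatch done = new CountDownLatch(slices);
        AtomicInteger active = new AtomicInteger(), peak = new AtomicInteger();
        for (int i = 0; i < slices; i++) {
            pool.execute(() -> {
                try {
                    permits.acquire();
                    peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                    Thread.sleep(20);  // stand-in for searching one slice
                    active.decrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    permits.release();
                    done.countDown();
                }
            });
        }
        done.await();
        pool.shutdown();
        return peak.get();  // never exceeds PERMITS
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrent slices: " + searchWithThrottle(16));
    }
}
```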

 Generalize how IndexSearcher parallelizes collection execution
 --

 Key: LUCENE-6294
 URL: https://issues.apache.org/jira/browse/LUCENE-6294
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6294.patch


 IndexSearcher takes an ExecutorService that can be used to parallelize 
 collection execution. This is useful if you want to trade throughput for 
 latency.
 However, this executor service will only be used if you search for top docs. 
 In that case, we will create one collector per slice and call TopDocs.merge 
 in the end. If you use search(Query, Collector), the executor service will 
 never be used.
 But there are other collectors that could work the same way as top docs 
 collectors, eg. TotalHitCountCollector. And maybe also some of our users' 
 collectors. So maybe IndexSearcher could expose a generic way to take 
 advantage of the executor service?
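The top-docs path described here can be sketched without Lucene types; the names below (SliceMergeSketch, Hit, topK) are illustrative only. One small top-k collector runs per slice on an executor, then the partial results are merged, which is the role TopDocs.merge plays for real TopDocs:

```java
import java.util.*;
import java.util.concurrent.*;

// Per-slice top-k collection followed by a final merge of the partial results.
public class SliceMergeSketch {
    static final class Hit {
        final int doc; final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
        @Override public String toString() { return "doc" + doc + "=" + score; }
    }

    // Keep only the k best hits using a min-heap ordered by score.
    static List<Hit> topK(List<Hit> hits, int k) {
        PriorityQueue<Hit> pq = new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));
        for (Hit h : hits) {
            pq.offer(h);
            if (pq.size() > k) pq.poll();  // evict the current worst
        }
        List<Hit> out = new ArrayList<>(pq);
        out.sort((a, b) -> Float.compare(b.score, a.score));  // best first
        return out;
    }

    public static List<Hit> search(List<List<Hit>> slices, int k) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, slices.size()));
        List<Future<List<Hit>>> partials = new ArrayList<>();
        for (List<Hit> slice : slices) {
            partials.add(pool.submit(() -> topK(slice, k)));  // one collector per slice
        }
        List<Hit> merged = new ArrayList<>();
        for (Future<List<Hit>> f : partials) merged.addAll(f.get());
        pool.shutdown();
        return topK(merged, k);  // final merge of the per-slice top-k lists
    }

    public static void main(String[] args) throws Exception {
        List<List<Hit>> slices = List.of(
            List.of(new Hit(1, 0.9f), new Hit(2, 0.1f)),
            List.of(new Hit(3, 0.8f), new Hit(4, 0.7f)));
        System.out.println(search(slices, 2));
    }
}
```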



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2015-02-27 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340521#comment-14340521
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

LUCENE-6294 is definitely a less intrusive approach. I think the tradeoff is 
that by moving the parallelization into the {{Collector}} API itself, we can 
make it composable and work for any arbitrary permutation of parallelizable 
collectors.

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Fix For: Trunk, 5.1

 Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, 
 LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt


 h2. Motivation
 We should be able to scale-up better with Solr/Lucene by utilizing multiple 
 CPU cores, and not have to resort to scaling-out by sharding (with all the 
 associated distributed system pitfalls) when the index size does not warrant 
 it.
 Presently, IndexSearcher has an optional constructor arg for an 
 ExecutorService, which gets used for searching in parallel for call paths 
 where one of the TopDocCollector's is created internally. The 
 per-atomic-reader search happens in parallel and then the 
 TopDocs/TopFieldDocs results are merged with locking around the merge bit.
 However there are some problems with this approach:
 * If arbitrary Collector args come into play, we can't parallelize. Note that 
 even if ultimately results are going to a TopDocCollector it may be wrapped 
 inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
 * The special-casing with parallelism baked on top does not scale; there are 
 many Collectors that could potentially lend themselves to parallelism, and 
 special-casing means the parallelization has to be re-implemented if a 
 different permutation of collectors is to be used.
 h2. Proposal
 A refactoring of collectors that allows for parallelization at the level of 
 the collection protocol. 
 Some requirements that should guide the implementation:
 * easy migration path for collectors that need to remain serial
 * the parallelization should be composable (when collectors wrap other 
 collectors)
 * allow collectors to pick the optimal solution (e.g. there might be memory 
 tradeoffs to be made) by advising the collector about whether a search will 
 be parallelized, so that the serial use-case is not penalized.
 * encourage use of non-blocking constructs and lock-free parallelism, 
 blocking is not advisable for the hot-spot of a search, besides wasting 
 pooled threads.
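The requirements above can be sketched as a minimal protocol. The names below (ParCollector, LeafCollector, HitCountCollector) are hypothetical, not the actual Lucene API: the collector hands out one leaf collector per segment, leaves accumulate lock-free on whatever thread searches them, and done() merges. A collector that must stay serial could simply return the same leaf state every time:

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.LongAdder;

// A composable, lock-free collection protocol: per-leaf state plus a final merge.
public class ParCollectSketch {
    interface LeafCollector { void collect(int doc); }

    interface ParCollector<R> {
        LeafCollector newLeafCollector();  // invoked once per leaf/slice, any thread
        R done();                          // called after all leaves finish
    }

    // Parallel-friendly hit counter: LongAdder avoids a shared-lock hot-spot.
    static class HitCountCollector implements ParCollector<Long> {
        private final LongAdder count = new LongAdder();
        public LeafCollector newLeafCollector() { return doc -> count.increment(); }
        public Long done() { return count.sum(); }
    }

    public static long countInParallel(int leaves, int docsPerLeaf) throws Exception {
        HitCountCollector collector = new HitCountCollector();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<?>> tasks = new ArrayList<>();
        for (int l = 0; l < leaves; l++) {
            LeafCollector leaf = collector.newLeafCollector();
            tasks.add(pool.submit(() -> {
                for (int d = 0; d < docsPerLeaf; d++) leaf.collect(d);  // "search" one leaf
            }));
        }
        for (Future<?> t : tasks) t.get();
        pool.shutdown();
        return collector.done();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("hits: " + countInParallel(8, 1000));
    }
}
```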






[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution

2015-02-27 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340640#comment-14340640
 ] 

Shikhar Bhushan commented on LUCENE-6294:
-

Makes sense! Seems to be already customizable by overriding that method.

 Generalize how IndexSearcher parallelizes collection execution
 --

 Key: LUCENE-6294
 URL: https://issues.apache.org/jira/browse/LUCENE-6294
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6294.patch


 IndexSearcher takes an ExecutorService that can be used to parallelize 
 collection execution. This is useful if you want to trade throughput for 
 latency.
 However, this executor service will only be used if you search for top docs. 
 In that case, we will create one collector per slice and call TopDocs.merge 
 in the end. If you use search(Query, Collector), the executor service will 
 never be used.
 But there are other collectors that could work the same way as top docs 
 collectors, eg. TotalHitCountCollector. And maybe also some of our users' 
 collectors. So maybe IndexSearcher could expose a generic way to take 
 advantage of the executor service?






[jira] [Comment Edited] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution

2015-02-27 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340661#comment-14340661
 ] 

Shikhar Bhushan edited comment on LUCENE-6294 at 2/27/15 7:25 PM:
--

When slicing differently than segment-per-slice, it'd probably be desirable to 
distribute segments by size across the slices, rather than all large segments 
ending up in one slice to be searched sequentially.
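The balancing idea above can be sketched as greedy longest-processing-time assignment: sort segment sizes descending and always give the next segment to the currently lightest slice. This is purely illustrative (SliceBalancer is a made-up name, and this is not how IndexSearcher actually builds slices):

```java
import java.util.*;

// Greedy LPT scheduling: big segments spread across slices instead of piling up.
public class SliceBalancer {
    public static List<List<Long>> balance(long[] segmentSizes, int slices) {
        long[] sorted = segmentSizes.clone();
        Arrays.sort(sorted);                                // ascending...
        List<List<Long>> out = new ArrayList<>();
        long[] load = new long[slices];
        for (int i = 0; i < slices; i++) out.add(new ArrayList<>());
        for (int i = sorted.length - 1; i >= 0; i--) {      // ...walk descending
            int lightest = 0;                               // find lightest slice
            for (int s = 1; s < slices; s++) if (load[s] < load[lightest]) lightest = s;
            load[lightest] += sorted[i];
            out.get(lightest).add(sorted[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // the two big segments land on different slices
        System.out.println(balance(new long[]{900, 800, 100, 50, 40, 10}, 2));
    }
}
```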


was (Author: shikhar):
When slicing differnetly than segment-per-slice, it'd probably be desirable to 
distribute segments by size across the slices, rather than all large segments 
ending up in one slice to be searched sequentially.

 Generalize how IndexSearcher parallelizes collection execution
 --

 Key: LUCENE-6294
 URL: https://issues.apache.org/jira/browse/LUCENE-6294
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6294.patch


 IndexSearcher takes an ExecutorService that can be used to parallelize 
 collection execution. This is useful if you want to trade throughput for 
 latency.
 However, this executor service will only be used if you search for top docs. 
 In that case, we will create one collector per slice and call TopDocs.merge 
 in the end. If you use search(Query, Collector), the executor service will 
 never be used.
 But there are other collectors that could work the same way as top docs 
 collectors, eg. TotalHitCountCollector. And maybe also some of our users' 
 collectors. So maybe IndexSearcher could expose a generic way to take 
 advantage of the executor service?






[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution

2015-02-27 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340661#comment-14340661
 ] 

Shikhar Bhushan commented on LUCENE-6294:
-

When slicing differently than segment-per-slice, it'd probably be desirable to 
distribute large segments by size across the slices, rather than all of them 
ending up in one slice to be searched sequentially.

 Generalize how IndexSearcher parallelizes collection execution
 --

 Key: LUCENE-6294
 URL: https://issues.apache.org/jira/browse/LUCENE-6294
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6294.patch


 IndexSearcher takes an ExecutorService that can be used to parallelize 
 collection execution. This is useful if you want to trade throughput for 
 latency.
 However, this executor service will only be used if you search for top docs. 
 In that case, we will create one collector per slice and call TopDocs.merge 
 in the end. If you use search(Query, Collector), the executor service will 
 never be used.
 But there are other collectors that could work the same way as top docs 
 collectors, eg. TotalHitCountCollector. And maybe also some of our users' 
 collectors. So maybe IndexSearcher could expose a generic way to take 
 advantage of the executor service?






[jira] [Comment Edited] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution

2015-02-27 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340661#comment-14340661
 ] 

Shikhar Bhushan edited comment on LUCENE-6294 at 2/27/15 7:25 PM:
--

When slicing differently than segment-per-slice, it'd probably be desirable to 
distribute segments by size across the slices, rather than all large segments 
ending up in one slice to be searched sequentially.


was (Author: shikhar):
When slicing differnetly than segment-per-slice, it'd probably be desirable to 
distribute large segments by size across the slices, rather than all of them 
ending up in one slice to be searched sequentially.

 Generalize how IndexSearcher parallelizes collection execution
 --

 Key: LUCENE-6294
 URL: https://issues.apache.org/jira/browse/LUCENE-6294
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6294.patch


 IndexSearcher takes an ExecutorService that can be used to parallelize 
 collection execution. This is useful if you want to trade throughput for 
 latency.
 However, this executor service will only be used if you search for top docs. 
 In that case, we will create one collector per slice and call TopDocs.merge 
 in the end. If you use search(Query, Collector), the executor service will 
 never be used.
 But there are other collectors that could work the same way as top docs 
 collectors, eg. TotalHitCountCollector. And maybe also some of our users' 
 collectors. So maybe IndexSearcher could expose a generic way to take 
 advantage of the executor service?






[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2014-11-14 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14213132#comment-14213132
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

Slides from my talk at Lucene/Solr Revolution 2014 about this stuff - 
https://www.dropbox.com/s/h2nqsml0beed0pm/Search-time%20Parallelism.pdf

Some backstory about the recent revival of this issue. The presentation was 
going to be a failure story, since I had not seen good performance on our test 
cluster when I tried it out last year.

However, after adding that request-level parallelism throttle and possibly 
eliminating some bugs while cherry-picking onto latest trunk, I have seen 
consistently good results. You can see from the replay graphs towards the end 
that p99 dropped by half, p95 improved by a few hundred ms, and the median 
looks much improved too. CPU usage was higher, as expected, but about 
comparable to (I think less than, but I don't have numbers) the overhead we 
saw by sharding and running all the shards on localhost. We are still sharded 
in this manner, so as you can see we considered the latency win to be worth it!

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, 
 LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt


 h2. Motivation
 We should be able to scale-up better with Solr/Lucene by utilizing multiple 
 CPU cores, and not have to resort to scaling-out by sharding (with all the 
 associated distributed system pitfalls) when the index size does not warrant 
 it.
 Presently, IndexSearcher has an optional constructor arg for an 
 ExecutorService, which gets used for searching in parallel for call paths 
 where one of the TopDocCollector's is created internally. The 
 per-atomic-reader search happens in parallel and then the 
 TopDocs/TopFieldDocs results are merged with locking around the merge bit.
 However there are some problems with this approach:
 * If arbitrary Collector args come into play, we can't parallelize. Note that 
 even if ultimately results are going to a TopDocCollector it may be wrapped 
 inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
 * The special-casing with parallelism baked on top does not scale; there are 
 many Collectors that could potentially lend themselves to parallelism, and 
 special-casing means the parallelization has to be re-implemented if a 
 different permutation of collectors is to be used.
 h2. Proposal
 A refactoring of collectors that allows for parallelization at the level of 
 the collection protocol. 
 Some requirements that should guide the implementation:
 * easy migration path for collectors that need to remain serial
 * the parallelization should be composable (when collectors wrap other 
 collectors)
 * allow collectors to pick the optimal solution (e.g. there might be memory 
 tradeoffs to be made) by advising the collector about whether a search will 
 be parallelized, so that the serial use-case is not penalized.
 * encourage use of non-blocking constructs and lock-free parallelism, 
 blocking is not advisable for the hot-spot of a search, besides wasting 
 pooled threads.






[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2014-10-27 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185751#comment-14185751
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

Just an update that the code rebased against recent trunk lives at 
https://github.com/shikhar/lucene-solr/tree/LUCENE-5299. I've made various 
tweaks, like being able to throttle per-request parallelism in 
{{ParallelSearchStrategy}}.

luceneutil bench numbers when running with ^
  + hacked IndexSearcher constructor that uses {{ParallelSearchStrategy(new 
ForkJoinPool(128), 8)}}
  + luceneutil constants.py SEARCH_NUM_THREADS = 16

Against trunk, on a 32 core (with HT) Sandy Bridge server, with source 
{{wikimedium500k}}

{noformat}
Report after iter 19:
TaskQPS baseline  StdDev  QPS parcol  StdDev
Pct diff
  Fuzzy1   81.91 (43.2%)   52.96 (39.7%)  
-35.3% ( -82% -   83%)
 LowTerm 2550.11 (11.9%) 1927.28  (5.6%)  
-24.4% ( -37% -   -7%)
 Respell   43.02 (39.4%)   35.23 (31.5%)  
-18.1% ( -63% -   87%)
  Fuzzy2   19.32 (25.1%)   16.40 (34.8%)  
-15.1% ( -59% -   59%)
 MedTerm 1679.37 (12.2%) 1743.27  (8.6%)
3.8% ( -15% -   28%)
PKLookup  221.58  (8.3%)  257.36 (13.2%)   
16.1% (  -4% -   41%)
  AndHighLow 1027.99 (11.6%) 1278.39 (15.9%)   
24.4% (  -2% -   58%)
  AndHighMed  741.50 (10.0%) 1198.04 (27.5%)   
61.6% (  21% -  110%)
   MedPhrase  709.04 (11.6%) 1203.02 (24.3%)   
69.7% (  30% -  119%)
 LowSpanNear  601.13 (16.9%) 1127.30 (16.7%)   
87.5% (  46% -  145%)
 LowSloppyPhrase  554.87 (10.8%) 1130.25 (30.5%)  
103.7% (  56% -  162%)
   OrHighMed  408.55 (10.4%)  977.56 (20.1%)  
139.3% (  98% -  189%)
   LowPhrase  364.36 (10.8%)  893.27 (41.0%)  
145.2% (  84% -  220%)
   OrHighLow  355.78 (12.7%)  893.63 (19.6%)  
151.2% ( 105% -  210%)
 AndHighHigh  390.73 (10.3%) 1004.70 (24.3%)  
157.1% ( 111% -  213%)
HighTerm  399.01 (11.8%) 1067.67 (12.1%)  
167.6% ( 128% -  217%)
Wildcard  754.76 (11.6%) 2067.96 (28.0%)  
174.0% ( 120% -  241%)
HighSpanNear  153.57 (14.8%)  463.54 (24.3%)  
201.8% ( 141% -  282%)
  OrHighHigh  212.16 (12.4%)  665.56 (28.2%)  
213.7% ( 154% -  290%)
  HighPhrase  170.49 (13.1%)  547.72 (17.3%)  
221.3% ( 168% -  289%)
HighSloppyPhrase   66.91 (10.1%)  219.59 (12.0%)  
228.2% ( 187% -  278%)
 MedSloppyPhrase  128.73 (12.5%)  425.67 (20.3%)  
230.7% ( 175% -  300%)
 MedSpanNear  130.31 (10.7%)  436.12 (18.2%)  
234.7% ( 185% -  295%)
 Prefix3  166.91 (14.9%)  652.64 (26.7%)  
291.0% ( 217% -  390%)
  IntNRQ  110.73 (15.0%)  467.72 (33.6%)  
322.4% ( 238% -  436%)
{noformat}


 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, 
 LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt


 h2. Motivation
 We should be able to scale-up better with Solr/Lucene by utilizing multiple 
 CPU cores, and not have to resort to scaling-out by sharding (with all the 
 associated distributed system pitfalls) when the index size does not warrant 
 it.
 Presently, IndexSearcher has an optional constructor arg for an 
 ExecutorService, which gets used for searching in parallel for call paths 
 where one of the TopDocCollector's is created internally. The 
 per-atomic-reader search happens in parallel and then the 
 TopDocs/TopFieldDocs results are merged with locking around the merge bit.
 However there are some problems with this approach:
 * If arbitrary Collector args come into play, we can't parallelize. Note that 
 even if ultimately results are going to a TopDocCollector it may be wrapped 
 inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
 * The special-casing with parallelism baked on top does not scale; there are 
 many Collectors that could potentially lend themselves to parallelism, and 
 special-casing means the parallelization has to be re-implemented if a 
 different permutation of collectors is to be used.
 h2. Proposal
 A 

[jira] [Comment Edited] (LUCENE-5299) Refactor Collector API for parallelism

2014-10-27 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185751#comment-14185751
 ] 

Shikhar Bhushan edited comment on LUCENE-5299 at 10/27/14 9:01 PM:
---

Just an update that the code rebased against recent trunk lives at 
https://github.com/shikhar/lucene-solr/tree/LUCENE-5299. I've made various 
tweaks, like being able to throttle per-request parallelism in 
{{ParallelSearchStrategy}}.

luceneutil bench numbers when running with ^ + hacked IndexSearcher constructor 
that uses {{ParallelSearchStrategy(new ForkJoinPool(128), 8)}}, against trunk, 
on a 32 core (with HT) Sandy Bridge server, with source {{wikimedium500k}}

SEARCH_NUM_THREADS = 16
{noformat}
Report after iter 19:
TaskQPS baseline  StdDev  QPS parcol  StdDev
Pct diff
  Fuzzy1   81.91 (43.2%)   52.96 (39.7%)  
-35.3% ( -82% -   83%)
 LowTerm 2550.11 (11.9%) 1927.28  (5.6%)  
-24.4% ( -37% -   -7%)
 Respell   43.02 (39.4%)   35.23 (31.5%)  
-18.1% ( -63% -   87%)
  Fuzzy2   19.32 (25.1%)   16.40 (34.8%)  
-15.1% ( -59% -   59%)
 MedTerm 1679.37 (12.2%) 1743.27  (8.6%)
3.8% ( -15% -   28%)
PKLookup  221.58  (8.3%)  257.36 (13.2%)   
16.1% (  -4% -   41%)
  AndHighLow 1027.99 (11.6%) 1278.39 (15.9%)   
24.4% (  -2% -   58%)
  AndHighMed  741.50 (10.0%) 1198.04 (27.5%)   
61.6% (  21% -  110%)
   MedPhrase  709.04 (11.6%) 1203.02 (24.3%)   
69.7% (  30% -  119%)
 LowSpanNear  601.13 (16.9%) 1127.30 (16.7%)   
87.5% (  46% -  145%)
 LowSloppyPhrase  554.87 (10.8%) 1130.25 (30.5%)  
103.7% (  56% -  162%)
   OrHighMed  408.55 (10.4%)  977.56 (20.1%)  
139.3% (  98% -  189%)
   LowPhrase  364.36 (10.8%)  893.27 (41.0%)  
145.2% (  84% -  220%)
   OrHighLow  355.78 (12.7%)  893.63 (19.6%)  
151.2% ( 105% -  210%)
 AndHighHigh  390.73 (10.3%) 1004.70 (24.3%)  
157.1% ( 111% -  213%)
HighTerm  399.01 (11.8%) 1067.67 (12.1%)  
167.6% ( 128% -  217%)
Wildcard  754.76 (11.6%) 2067.96 (28.0%)  
174.0% ( 120% -  241%)
HighSpanNear  153.57 (14.8%)  463.54 (24.3%)  
201.8% ( 141% -  282%)
  OrHighHigh  212.16 (12.4%)  665.56 (28.2%)  
213.7% ( 154% -  290%)
  HighPhrase  170.49 (13.1%)  547.72 (17.3%)  
221.3% ( 168% -  289%)
HighSloppyPhrase   66.91 (10.1%)  219.59 (12.0%)  
228.2% ( 187% -  278%)
 MedSloppyPhrase  128.73 (12.5%)  425.67 (20.3%)  
230.7% ( 175% -  300%)
 MedSpanNear  130.31 (10.7%)  436.12 (18.2%)  
234.7% ( 185% -  295%)
 Prefix3  166.91 (14.9%)  652.64 (26.7%)  
291.0% ( 217% -  390%)
  IntNRQ  110.73 (15.0%)  467.72 (33.6%)  
322.4% ( 238% -  436%)
{noformat}

SEARCH_NUM_THREADS=32
{noformat}
TaskQPS baseline  StdDev  QPS parcol  StdDev
Pct diff
 LowTerm 2401.88 (12.7%) 1799.27  (6.3%)  
-25.1% ( -39% -   -6%)
  Fuzzy26.52 (14.4%)5.74 (24.0%)  
-11.9% ( -43% -   30%)
 Respell   45.13 (90.2%)   40.94 (83.5%)   
-9.3% ( -96% - 1679%)
PKLookup  232.02 (12.9%)  228.35 (12.4%)   
-1.6% ( -23% -   27%)
 MedTerm 1612.01 (14.0%) 1601.71 (10.9%)   
-0.6% ( -22% -   28%)
  Fuzzy1   14.19 (79.3%)   14.71(177.6%)
3.7% (-141% - 1258%)
  AndHighLow 1205.65 (17.5%) 1254.76 (15.9%)
4.1% ( -24% -   45%)
 MedSpanNear  478.11 (25.4%)  946.72 (34.5%)   
98.0% (  30% -  211%)
   OrHighLow  424.71 (14.5%)  941.39 (31.4%)  
121.7% (  66% -  195%)
 AndHighHigh  377.82 (13.3%)  910.77 (32.2%)  
141.1% (  84% -  215%)
HighTerm  325.35 (11.3%)  855.63  (8.9%)  
163.0% ( 128% -  206%)
  AndHighMed  346.57 (11.7%)  914.59 (26.4%)  
163.9% ( 112% -  228%)
   MedPhrase  227.47 (13.1%)  621.50 (22.9%)  
173.2% ( 121% -  240%)
 LowSloppyPhrase  265.21 (10.4%)  748.30 (49.2%)  
182.2% ( 110% -  269%)
   OrHighMed  221.49 (12.2%)  632.55 (23.9%)  
185.6% ( 133% -  

[jira] [Closed] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats

2014-05-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan closed SOLR-5648.
-

Resolution: Invalid

bq. 1) I'm not sure i really understand what this adds – isn't every registered 
searcher (which should include every open searcher if there are more then one) 
already listed in the infoRegistry (so it's stats are surfaced in /admin/mbeans 
and via JMX) ?

you're right! that's much better.

 SolrCore#getStatistics() should nest open searchers' stats
 --

 Key: SOLR-5648
 URL: https://issues.apache.org/jira/browse/SOLR-5648
 Project: Solr
  Issue Type: Task
Reporter: Shikhar Bhushan
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: SOLR-5648.patch, oldestSearcherStaleness.gif, 
 openSearchers.gif


 {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues 
 in codebases with custom components.
 So it is useful to be able to access monitoring information about what 
 searchers are currently open, and in turn access their stats e.g. 
 {{openedAt}}.
 This can be nested via {{SolrCore#getStatistics()}} which has a 
 {{_searchers}} collection of all open searchers.






[jira] [Commented] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used

2014-05-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004796#comment-14004796
 ] 

Shikhar Bhushan commented on SOLR-6105:
---

paging [~shalinmangar] in case you have any idea what might be going on

 DebugComponent NPE when single-pass distributed search is used
 --

 Key: SOLR-6105
 URL: https://issues.apache.org/jira/browse/SOLR-6105
 Project: Solr
  Issue Type: Bug
Reporter: Shikhar Bhushan
Priority: Minor

 I'm seeing NPEs in {{DebugComponent}} with debugQuery=true when just ID & 
 score are requested, which enables the single-pass distributed search 
 optimization from SOLR-1880.
 The NPE originates on this line in DebugComponent.finishStage():
 {noformat}
 int idx = sdoc.positionInResponse;
 {noformat}
 indicating an ID that is in the explain but missing in the resultIds.
 I'm afraid I haven't been able to reproduce this in 
 {{DistributedQueryComponentOptimizationTest}}, but wanted to open this ticket 
 in any case.






[jira] [Created] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used

2014-05-21 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created SOLR-6105:
-

 Summary: DebugComponent NPE when single-pass distributed search is 
used
 Key: SOLR-6105
 URL: https://issues.apache.org/jira/browse/SOLR-6105
 Project: Solr
  Issue Type: Bug
Reporter: Shikhar Bhushan
Priority: Minor


I'm seeing NPEs in {{DebugComponent}} with debugQuery=true when just ID & 
score are requested, which enables the single-pass distributed search 
optimization from SOLR-1880.

The NPE originates on this line in DebugComponent.finishStage():

{noformat}
int idx = sdoc.positionInResponse;
{noformat}

indicating an ID that is in the explain but missing in the resultIds.

I'm afraid I haven't been able to reproduce this in 
{{DistributedQueryComponentOptimizationTest}}, but wanted to open this ticket 
in any case.






[jira] [Commented] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used

2014-05-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004799#comment-14004799
 ] 

Shikhar Bhushan commented on SOLR-6105:
---

also paging [~vzhovtiuk] - presumably you're using this feature in your app. 
does debugQuery=true work ok for you?

 DebugComponent NPE when single-pass distributed search is used
 --

 Key: SOLR-6105
 URL: https://issues.apache.org/jira/browse/SOLR-6105
 Project: Solr
  Issue Type: Bug
Reporter: Shikhar Bhushan
Priority: Minor

 I'm seeing NPEs in {{DebugComponent}} with debugQuery=true when just ID & 
 score are requested, which enables the single-pass distributed search 
 optimization from SOLR-1880.
 The NPE originates on this line in DebugComponent.finishStage():
 {noformat}
 int idx = sdoc.positionInResponse;
 {noformat}
 indicating an ID that is in the explain but missing in the resultIds.
 I'm afraid I haven't been able to reproduce this in 
 {{DistributedQueryComponentOptimizationTest}}, but wanted to open this ticket 
 in any case.






[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-16 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999732#comment-13999732
 ] 

Shikhar Bhushan commented on LUCENE-4370:
-

Been thinking about the semantics of these done callbacks not being invoked in 
case of exceptions, which was a concern raised by [~jpountz] in LUCENE-5527. 
This seems unhelpful when e.g. a TimeExceededException or 
EarlyTerminatingCollectorException is thrown and you need to merge some state 
into the parent collector in {{LeafCollector.leafDone()}}, or perhaps finalize 
results in {{Collector.done()}}.

Maybe we need a special kind of exception, just like 
CollectionTerminatedException (whose current semantics are that collection 
continues with the next leaf). So perhaps a new base class for the "rethrow 
me, but invoke the done callbacks" case?

In case of any other kind of exception, like IOException, I don't think we 
should be invoking the done() callbacks, because the collector's results 
cannot be expected to be usable.
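A minimal sketch of what such an exception base class might look like, assuming the search loop catches it specially (all names here are invented; nothing like this exists in Lucene):

```java
// Hypothetical exception whose contract would be "invoke the done()
// callbacks, then rethrow to the caller" -- by analogy with Lucene's
// CollectionTerminatedException.
class CollectionAbortedException extends RuntimeException {
    CollectionAbortedException(String reason) { super(reason); }
}

interface DoneAwareCollector {
    void collect(int doc);
    void done();  // fires on normal completion or CollectionAbortedException
}

class CollectionDriver {
    // Drives collection; done() fires for clean completion and for the
    // special exception, but not for arbitrary failures like IOException.
    static void search(int[] docs, DoneAwareCollector c) {
        try {
            for (int doc : docs) c.collect(doc);
            c.done();
        } catch (CollectionAbortedException e) {
            c.done();   // partial results are still usable
            throw e;    // but the caller learns collection was cut short
        }
    }
}
```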

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Attachments: LUCENE-4370.patch, LUCENE-4370.patch


 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-15 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-4370:


Attachment: (was: LUCENE-4370.patch)

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor

 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-14 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-4370:


Attachment: LUCENE-4370.patch

Attaching another version, which adds a callback both on Collector ({{void 
done();}}) and on LeafCollector ({{void leafDone();}}).

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Attachments: LUCENE-4370.patch, LUCENE-4370.patch


 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-14 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997666#comment-13997666
 ] 

Shikhar Bhushan commented on LUCENE-4370:
-

 On one hand I think a Collector.finish() would be nice, but the argument 
 could be made that you could handle this yourself (it's done when 
 IndexSearcher.search returns).

Such a technique does not compose easily e.g. when you want to wrap collectors 
in other collectors, unless you customize each and every one in the chain.
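To illustrate the composition point, here is a sketch (interface and class names invented) of how a finish-style hook propagates through a chain of wrapped collectors, which "doing it after search() returns" cannot do without knowing the chain's shape:

```java
// Hypothetical interfaces showing why a finish() hook composes: each
// wrapper forwards the callback down the chain, so the caller never
// needs to know how deeply collectors are nested.
interface FinishingCollector {
    void collect(int doc);
    void finish();
}

// Terminal collector that just records that finish() reached it.
class SinkCollector implements FinishingCollector {
    boolean finished;
    public void collect(int doc) {}
    public void finish() { finished = true; }
}

// A wrapper: flushes its own state on finish(), then propagates.
class CountingWrapper implements FinishingCollector {
    final FinishingCollector delegate;
    int count;
    boolean finished;

    CountingWrapper(FinishingCollector delegate) { this.delegate = delegate; }

    public void collect(int doc) { count++; delegate.collect(doc); }

    public void finish() {
        finished = true;    // finalize this wrapper's own results first...
        delegate.finish();  // ...then notify whatever it wraps
    }
}
```

Calling finish() once on the outermost collector notifies every link in the chain, with no custom plumbing per wrapper.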

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Attachments: LUCENE-4370.patch, LUCENE-4370.patch


 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-13 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995070#comment-13995070
 ] 

Shikhar Bhushan commented on LUCENE-4370:
-

Umm, I totally forgot about the callers. Updated patch coming.

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Attachments: LUCENE-4370.patch, LUCENE-4370.patch


 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-12 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-4370:


Attachment: LUCENE-4370.patch

Attaching patch.

I think there is huge potential for cleanups if this goes in; I'm happy to 
work on some of that.

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Attachments: LUCENE-4370.patch, LUCENE-4370.patch


 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-12 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-4370:


Attachment: LUCENE-4370.patch

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Attachments: LUCENE-4370.patch, LUCENE-4370.patch


 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-12 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-4370:


Attachment: (was: LUCENE-4370.patch)

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor

 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-12 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-4370:


Attachment: LUCENE-4370.patch

Attaching patch. I updated callers based on auditing usages of 
{{Collector.getLeafCollector(..)}}.

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Attachments: LUCENE-4370.patch


 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2014-04-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13961394#comment-13961394
 ] 

Shikhar Bhushan commented on LUCENE-4370:
-

Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void 
finish();}}

Semantics: it is invoked when collection of that leaf has completed. It is 
not invoked if collection terminates due to an exception.

I know this ticket was originally about having such a method on {{Collector}} 
and not at the segment-level collection, however I think all use cases can be 
cleanly modelled in this manner.

As naming goes, I think {{finish()}} or {{done()}} or such is better than 
{{close()}}, which implies a try-finally'esque construct.

/cc [~jpountz] [~rcmuir] [~hossman]
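The proposed per-leaf contract can be sketched roughly as follows (class names are invented; only the shape mirrors the trunk Collector/LeafCollector split): each leaf collector accumulates segment-local state and merges it into the top-level collector exactly once, in the done-style hook.

```java
import java.util.concurrent.atomic.AtomicLong;

class TopLevelHitCounter {
    final AtomicLong total = new AtomicLong();

    LeafHitCounter newLeafCollector() { return new LeafHitCounter(this); }

    static class LeafHitCounter {
        final TopLevelHitCounter parent;
        long leafHits;  // cheap segment-local counter, no contention

        LeafHitCounter(TopLevelHitCounter parent) { this.parent = parent; }

        void collect(int doc) { leafHits++; }

        // The proposed hook: called once when this leaf's collection
        // completes normally; skipped if collection fails with an
        // unexpected exception, leaving the parent's total untouched.
        void done() { parent.total.addAndGet(leafHits); }
    }
}
```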

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor

 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Comment Edited] (LUCENE-4370) Let Collector know when all docs have been collected

2014-04-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13961394#comment-13961394
 ] 

Shikhar Bhushan edited comment on LUCENE-4370 at 4/6/14 12:44 PM:
--

Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void 
done();}}

Semantics: it is invoked when collection of that leaf has completed. It is 
not invoked if collection terminates due to an exception.

I know this ticket was originally about having such a method on {{Collector}} 
and not at the segment-level collection, however I think all use cases can be 
cleanly modelled in this manner.

As naming goes, I think {{done()}} or such is better than {{close()}}, which 
implies a try-finally'esque construct.

Edit: changed my proposal from {{finish()}} to {{done()}} to avoid messing with 
existing uses e.g. {{DelegatingCollector}} which would currently extend 
{{SimpleCollector}} that implements both {{Collector}} and {{LeafCollector}}.

/cc [~jpountz] [~rcmuir] [~hossman]


was (Author: shikhar):
Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void 
finish();}

Semantics: It is invoked when collection with that leaf has completed. It is 
not invoked if collection does terminates due to an exception.

I know this ticket was originally about having such a method on {{Collector}} 
and not at the segment-level collection, however I think all use cases can be 
cleanly modelled in this manner.

As naming goes, I think {{finish()}} or {{done()}} or such is better than 
{{close()}}, which implies a try-finally'esque construct.

/cc [~jpountz] [~rcmuir] [~hossman]

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor

 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Comment Edited] (LUCENE-4370) Let Collector know when all docs have been collected

2014-04-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13961394#comment-13961394
 ] 

Shikhar Bhushan edited comment on LUCENE-4370 at 4/6/14 12:45 PM:
--

Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void 
done();}}

Semantics: It is invoked when collection with that leaf has completed. It is 
not invoked if collection terminates due to an exception.

I know this ticket was originally about having such a method on {{Collector}} 
and not at the segment-level collection, however I think all use cases can be 
cleanly modelled in this manner.

As naming goes, I think {{done()}} or such is better than {{close()}}, which 
implies a try-finally'esque construct.

Edit: changed my proposal from {{finish()}} to {{done()}} to avoid messing with 
existing uses e.g. {{DelegatingCollector}} which would currently extend 
{{SimpleCollector}} that implements both {{Collector}} and {{LeafCollector}}.

/cc [~jpountz] [~rcmuir] [~hossman]


was (Author: shikhar):
Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void 
done();}

Semantics: It is invoked when collection with that leaf has completed. It is 
not invoked if collection does terminates due to an exception.

I know this ticket was originally about having such a method on {{Collector}} 
and not at the segment-level collection, however I think all use cases can be 
cleanly modelled in this manner.

As naming goes, I think {{done()}} or such is better than {{close()}}, which 
implies a try-finally'esque construct.

Edit: changed my proposal from {{finish()}} to {{done()}} to avoid messing with 
existing uses e.g. {{DelegatingCollector}} which would currently extend 
{{SimpleCollector}} that implements both {{Collector}} and {{LeafCollector}}.

/cc [~jpountz] [~rcmuir] [~hossman]

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor

 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.






[jira] [Commented] (LUCENE-5527) Make the Collector API work per-segment

2014-04-03 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13959101#comment-13959101
 ] 

Shikhar Bhushan commented on LUCENE-5527:
-

Thanks for picking this up, Adrien! I always wanted to push forward at least 
the API refactoring but never got the chance to do so.

+1 on adding a method like LeafCollector.done()  / finish() or such, and making 
that part of the usage contract.

It's not just Solr with DelegatingCollector that has something like this; I 
think I remember seeing this pattern in ES as well.

LUCENE-5299 had this as a SubCollector.done() method, and it led to a lot of 
code cleanup in places where we were trying to detect a transition to the 
next segment based on a call to setNextReader(). In some cases, result 
finalization was being done lazily when the result-retrieval methods were 
called, because there is no other good way of knowing that the last segment 
has been processed.

 Make the Collector API work per-segment
 ---

 Key: LUCENE-5527
 URL: https://issues.apache.org/jira/browse/LUCENE-5527
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Priority: Minor
 Fix For: 5.0

 Attachments: LUCENE-5527.patch


 Spin-off of LUCENE-5299.
 LUCENE-5299 proposes different changes, some of them being controversial, but 
 there is one of them that I really really like that consists in refactoring 
 the {{Collector}} API in order to have a different Collector per segment.
 The idea is, instead of having a single Collector object that needs to be 
 able to take care of all segments, to have a top-level Collector:
 {code}
 public interface Collector {
   AtomicCollector setNextReader(AtomicReaderContext context) throws IOException;
 }
 {code}
 and a per-AtomicReaderContext collector:
 {code}
 public interface AtomicCollector {
   void setScorer(Scorer scorer) throws IOException;
   void collect(int doc) throws IOException;
   boolean acceptsDocsOutOfOrder();
 }
 {code}
 I think it makes the API clearer since it is now obvious that {{setScorer}} and 
 {{acceptsDocsOutOfOrder}} need to be called after {{setNextReader}}, which is 
 otherwise unclear.
 It also makes things more flexible. For example, a collector could much more 
 easily decide to use different strategies on different segments. In 
 particular, it makes the early-termination collector much cleaner since it 
 can return different atomic collectors implementations depending on whether 
 the current segment is sorted or not.
 Even if we have lots of collectors all over the place, we could make it 
 easier to migrate by having a Collector that would implement both Collector 
 and AtomicCollector, return {{this}} in setNextReader and make current 
 concrete Collector implementations extend this class instead of directly 
 extending Collector.
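The bridge class described in the last paragraph might look roughly like this (a sketch using simplified stand-in interfaces; setScorer, acceptsDocsOutOfOrder, checked exceptions, and the real AtomicReaderContext are omitted to keep it self-contained):

```java
// Simplified stand-ins for the proposed interfaces.
interface AtomicCollector {
    void collect(int doc);
}

interface Collector {
    AtomicCollector setNextReader(Object readerContext);
}

// The migration bridge: one class implements both interfaces and returns
// itself from setNextReader(), so existing single-object collectors can
// extend it rather than being rewritten for the per-segment API.
abstract class SimpleCollector implements Collector, AtomicCollector {
    @Override
    public AtomicCollector setNextReader(Object readerContext) {
        doSetNextReader(readerContext);
        return this;  // the same object collects for every segment
    }

    // Subclasses override this to react to segment transitions.
    protected void doSetNextReader(Object readerContext) {}
}
```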



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5637) Per-request cache statistics

2014-03-06 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5637:
--

Fix Version/s: (was: 4.7)

 Per-request cache statistics
 

 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor
 Attachments: SOLR-5367.patch, SOLR-5367.patch


 We have found it very useful to have information on the number of cache hits 
 and misses for key Solr caches (filterCache, documentCache, etc.) at the 
 request level.
 This is currently implemented in our codebase using custom {{SolrCache}} 
 implementations.
 I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
 thread-local, and adding hooks in get() methods of SolrCache implementations. 
 This will be glued up using the {{DebugComponent}} and can be requested using 
 a debug.cache parameter.
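The thread-local approach described above could be sketched like this (a hypothetical class, not the actual patch): a per-request stats holder that cache get() hooks bump, which a debug component reads out at the end of the request.

```java
import java.util.HashMap;
import java.util.Map;

class RequestCacheStats {
    // One hits/misses map per thread, i.e. per request in Solr's
    // one-thread-per-request model.
    private static final ThreadLocal<Map<String, long[]>> STATS =
            ThreadLocal.withInitial(HashMap::new);

    // Called from a hypothetical hook in SolrCache.get().
    static void record(String cacheName, boolean hit) {
        long[] hitsMisses = STATS.get().computeIfAbsent(cacheName, k -> new long[2]);
        hitsMisses[hit ? 0 : 1]++;
    }

    static long hits(String cacheName)   { return lookup(cacheName)[0]; }
    static long misses(String cacheName) { return lookup(cacheName)[1]; }

    private static long[] lookup(String name) {
        return STATS.get().getOrDefault(name, new long[2]);
    }

    // Reset at request boundaries so counts never leak across requests.
    static void clear() { STATS.get().clear(); }
}
```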






[jira] [Updated] (SOLR-5637) Per-request cache statistics

2014-03-06 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5637:
--

Attachment: SOLR-5637.patch

updated patch against lucene_solr_4_7 branch

 Per-request cache statistics
 

 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor
 Attachments: SOLR-5367.patch, SOLR-5367.patch, SOLR-5637.patch


 We have found it very useful to have information on the number of cache hits 
 and misses for key Solr caches (filterCache, documentCache, etc.) at the 
 request level.
 This is currently implemented in our codebase using custom {{SolrCache}} 
 implementations.
 I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
 thread-local, and adding hooks in get() methods of SolrCache implementations. 
 This will be glued up using the {{DebugComponent}} and can be requested using 
 a debug.cache parameter.






[jira] [Commented] (SOLR-5768) Add a distrib.singlePass parameter to make GET_FIELDS phase fetch all fields and skip EXECUTE_QUERY

2014-03-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922669#comment-13922669
 ] 

Shikhar Bhushan commented on SOLR-5768:
---

seems like the JIRA title has it the other way round :)

 Add a distrib.singlePass parameter to make GET_FIELDS phase fetch all fields 
 and skip EXECUTE_QUERY
 ---

 Key: SOLR-5768
 URL: https://issues.apache.org/jira/browse/SOLR-5768
 Project: Solr
  Issue Type: Improvement
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 4.8, 5.0


 Suggested by Yonik on solr-user:
 http://www.mail-archive.com/solr-user@lucene.apache.org/msg95045.html
 {quote}
 Although it seems like it should be relatively simple to make it work
 with other fields as well, by passing down the complete fl requested
 if some optional parameter is set (distrib.singlePass?)
 {quote}






[jira] [Commented] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats

2014-01-24 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13880795#comment-13880795
 ] 

Shikhar Bhushan commented on SOLR-5648:
---

[~otis] yup

 SolrCore#getStatistics() should nest open searchers' stats
 --

 Key: SOLR-5648
 URL: https://issues.apache.org/jira/browse/SOLR-5648
 Project: Solr
  Issue Type: Task
Reporter: Shikhar Bhushan
Priority: Minor
 Fix For: 4.7

 Attachments: SOLR-5648.patch, oldestSearcherStaleness.gif, 
 openSearchers.gif


 {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues 
 in codebases with custom components.
 So it is useful to be able to access monitoring information about what 
 searchers are currently open, and in turn access their stats e.g. 
 {{openedAt}}.
 This can be nested via {{SolrCore#getStatistics()}} which has a 
 {{_searchers}} collection of all open searchers.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats

2014-01-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5648:
--

Fix Version/s: 4.7

 SolrCore#getStatistics() should nest open searchers' stats
 --

 Key: SOLR-5648
 URL: https://issues.apache.org/jira/browse/SOLR-5648
 Project: Solr
  Issue Type: Task
Reporter: Shikhar Bhushan
Priority: Minor
 Fix For: 4.7

 Attachments: SOLR-5648.patch, oldestSearcherStaleness.gif, 
 openSearchers.gif


 {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues 
 in codebases with custom components.
 So it is useful to be able to access monitoring information about what 
 searchers are currently open, and in turn access their stats e.g. 
 {{openedAt}}.
 This can be nested via {{SolrCore#getStatistics()}} which has a 
 {{_searchers}} collection of all open searchers.






[jira] [Updated] (SOLR-5637) Per-request cache statistics

2014-01-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5637:
--

Fix Version/s: 4.7

 Per-request cache statistics
 

 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor
 Fix For: 4.7

 Attachments: SOLR-5367.patch, SOLR-5367.patch


 We have found it very useful to have information on the number of cache hits 
 and misses for key Solr caches (filterCache, documentCache, etc.) at the 
 request level.
 This is currently implemented in our codebase using custom {{SolrCache}} 
 implementations.
 I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
 thread-local, and adding hooks in get() methods of SolrCache implementations. 
 This will be glued up using the {{DebugComponent}} and can be requested using 
 a debug.cache parameter.






[jira] [Updated] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2014-01-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5505:
--

Fix Version/s: 4.7

 LoggingInfoStream not usable in a multi-core setup
 -

 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan
 Fix For: 4.7

 Attachments: SOLR-5505.patch, SOLR-5505.patch


 {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
 context.
 Previously this was possible by encoding this into the infoStream's file path.
 This means in a multi-core setup it is very hard to distinguish between the 
 infoStream messages for different cores.
 {{LoggingInfoStream}} should be automatically configured to prepend the core 
 name to log messages.
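A minimal sketch of the requested behavior (a hypothetical wrapper, not the actual patch): prefix every infoStream message with the core name so multi-core output becomes distinguishable.

```java
// Hypothetical formatter standing in for a core-aware LoggingInfoStream:
// every message carries the core name and the emitting component.
class CoreAwareInfoStream {
    private final String coreName;

    CoreAwareInfoStream(String coreName) { this.coreName = coreName; }

    // component is the infoStream source (e.g. "IW" for IndexWriter).
    String format(String component, String message) {
        return "[" + coreName + "][" + component + "] " + message;
    }
}
```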






[jira] [Updated] (SOLR-5637) Per-request cache statistics

2014-01-20 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5637:
--

Attachment: SOLR-5367.patch

works in the distrib case now

 Per-request cache statistics
 

 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor
 Attachments: SOLR-5367.patch, SOLR-5367.patch


 We have found it very useful to have information on the number of cache hits 
 and misses for key Solr caches (filterCache, documentCache, etc.) at the 
 request level.
 This is currently implemented in our codebase using custom {{SolrCache}} 
 implementations.
 I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
 thread-local, and adding hooks in get() methods of SolrCache implementations. 
 This will be glued up using the {{DebugComponent}} and can be requested using 
 a debug.cache parameter.






[jira] [Comment Edited] (SOLR-5637) Per-request cache statistics

2014-01-20 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876798#comment-13876798
 ] 

Shikhar Bhushan edited comment on SOLR-5637 at 1/20/14 7:54 PM:


Works in the distrib case now, though we end up getting aggregate numbers out via 
{{DebugComponent#merge()}} -- an enhancement might be to make the stats part 
of the 'track' response from shards.


was (Author: shikhar):
works in the distrib case now

 Per-request cache statistics
 

 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor
 Attachments: SOLR-5367.patch, SOLR-5367.patch


 We have found it very useful to have information on the number of cache hits 
 and misses for key Solr caches (filterCache, documentCache, etc.) at the 
 request level.
 This is currently implemented in our codebase using custom {{SolrCache}} 
 implementations.
 I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
 thread-local, and adding hooks in get() methods of SolrCache implementations. 
 This will be glued up using the {{DebugComponent}} and can be requested using 
 a debug.cache parameter.






[jira] [Comment Edited] (SOLR-5637) Per-request cache statistics

2014-01-20 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876798#comment-13876798
 ] 

Shikhar Bhushan edited comment on SOLR-5637 at 1/20/14 7:55 PM:


Works in the distrib case now, though we end up getting aggregate numbers out via 
{{DebugComponent#merge()}} -- an enhancement might be to make the stats be part 
of the 'track' response from shards.


was (Author: shikhar):
Works in the distrib case now, though we end up getting aggregate numbers out via 
{{DebugComponent#merge()} -- an enhancement might be to make the stats be part 
of the 'track' response from shards.

 Per-request cache statistics
 

 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor
 Attachments: SOLR-5367.patch, SOLR-5367.patch


 We have found it very useful to have information on the number of cache hits 
 and misses for key Solr caches (filterCache, documentCache, etc.) at the 
 request level.
 This is currently implemented in our codebase using custom {{SolrCache}} 
 implementations.
 I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
 thread-local, and adding hooks in get() methods of SolrCache implementations. 
 This will be glued up using the {{DebugComponent}} and can be requested using 
 a debug.cache parameter.






[jira] [Commented] (SOLR-5637) Per-request cache statistics

2014-01-20 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876801#comment-13876801
 ] 

Shikhar Bhushan commented on SOLR-5637:
---

For caches where this instrumentation is not desirable, it can be opted out of 
via the {{perRequestStats}} XML init arg for the {{SolrCache}} (takes boolean 
values true / false). It currently defaults to true.

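Assuming the opt-out is an init attribute on the cache declaration in solrconfig.xml (the comment above names only the {{perRequestStats}} attribute; its exact placement and the surrounding cache attributes here are illustrative), disabling it for one cache might look like:

```xml
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"
             perRequestStats="false"/>
```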
 Per-request cache statistics
 

 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor
 Attachments: SOLR-5367.patch, SOLR-5367.patch


 We have found it very useful to have information on the number of cache hits 
 and misses for key Solr caches (filterCache, documentCache, etc.) at the 
 request level.
 This is currently implemented in our codebase using custom {{SolrCache}} 
 implementations.
 I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
 thread-local, and adding hooks in get() methods of SolrCache implementations. 
 This will be glued up using the {{DebugComponent}} and can be requested using 
 a debug.cache parameter.






[jira] [Updated] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats

2014-01-20 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5648:
--

Description: 
{{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues in 
codebases with custom components.

So it is useful to be able to access monitoring information about what 
searchers are currently open, and in turn access their stats e.g. {{openedAt}}.

This can be nested via {{SolrCore#getStatistics()}} which has a {{_searchers}} 
collection of all open searchers.

  was:
{{SolrIndexSearcher}} leaks are a cause of garbage collection issues in 
codebases with custom components.

So it is useful to be able to access monitoring information about what 
searchers are currently open, and in turn access their stats e.g. {{openedAt}}.

This can be nested via {{SolrCore#getStatistics()}} which has a {{_searchers}} 
collection of all open searchers.


 SolrCore#getStatistics() should nest open searchers' stats
 --

 Key: SOLR-5648
 URL: https://issues.apache.org/jira/browse/SOLR-5648
 Project: Solr
  Issue Type: Task
Reporter: Shikhar Bhushan
Priority: Minor

 {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues 
 in codebases with custom components.
 So it is useful to be able to access monitoring information about what 
 searchers are currently open, and in turn access their stats e.g. 
 {{openedAt}}.
 This can be nested via {{SolrCore#getStatistics()}} which has a 
 {{_searchers}} collection of all open searchers.






[jira] [Created] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats

2014-01-20 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created SOLR-5648:
-

 Summary: SolrCore#getStatistics() should nest open searchers' stats
 Key: SOLR-5648
 URL: https://issues.apache.org/jira/browse/SOLR-5648
 Project: Solr
  Issue Type: Task
Reporter: Shikhar Bhushan
Priority: Minor


{{SolrIndexSearcher}} leaks are a cause of garbage collection issues in 
codebases with custom components.

So it is useful to be able to access monitoring information about what 
searchers are currently open, and in turn access their stats e.g. {{openedAt}}.

This can be nested via {{SolrCore#getStatistics()}} which has a {{_searchers}} 
collection of all open searchers.






[jira] [Updated] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats

2014-01-20 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5648:
--

Attachment: SOLR-5648.patch

Patch attached.

Note that the {{_searchers}} access is synchronized on {{searcherLock}} as per 
the usage pattern established in the class. It does not seem like that lock is 
held for too long wherever it is used, so this should be ok.
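
The pattern described (iterate the open-searcher collection briefly under the existing lock, then nest the result under the core's stats) can be sketched as below. The types are simplified stand-ins, not the actual {{SolrCore}}/{{SolrIndexSearcher}} API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of nesting per-searcher stats (e.g. openedAt) under
// a core's statistics: the open-searcher list is copied while holding the
// same lock the class already uses, so the lock is held only briefly and
// serialization happens outside it.
public class CoreStatsSketch {
    private final Object searcherLock = new Object();
    private final List<Map<String, Object>> searchers = new ArrayList<>();

    public void onSearcherOpen(Map<String, Object> searcherStats) {
        synchronized (searcherLock) {
            searchers.add(searcherStats);
        }
    }

    public Map<String, Object> getStatistics() {
        Map<String, Object> stats = new LinkedHashMap<>();
        List<Map<String, Object>> open = new ArrayList<>();
        synchronized (searcherLock) {
            open.addAll(searchers); // copy under the lock
        }
        stats.put("_searchers", open); // nested searcher stats
        return stats;
    }
}
```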

 SolrCore#getStatistics() should nest open searchers' stats
 --

 Key: SOLR-5648
 URL: https://issues.apache.org/jira/browse/SOLR-5648
 Project: Solr
  Issue Type: Task
Reporter: Shikhar Bhushan
Priority: Minor
 Attachments: SOLR-5648.patch


 {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues 
 in codebases with custom components.
 So it is useful to be able to access monitoring information about what 
 searchers are currently open, and in turn access their stats e.g. 
 {{openedAt}}.
 This can be nested via {{SolrCore#getStatistics()}} which has a 
 {{_searchers}} collection of all open searchers.






[jira] [Created] (SOLR-5637) Per-request cache statistics

2014-01-16 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created SOLR-5637:
-

 Summary: Per-request cache statistics
 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor


We have found it very useful to have information on the number of cache hits 
and misses for key Solr caches (filterCache, documentCache, etc.) at the 
request level.

This is currently implemented in our codebase using custom {{SolrCache}} 
implementations.

I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
thread-local, and adding hooks in get() methods of SolrCache implementations. 
This will be glued up using the {{DebugComponent}} and can be requested using a 
debug.cache parameter.






[jira] [Updated] (SOLR-5637) Per-request cache statistics

2014-01-16 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5637:
--

Attachment: SOLR-5367.patch

First cut of patch attached for feedback.

It needs to be made to work in the distributed case from {{DebugComponent}}, and 
I'm still figuring things out for that. Pointers appreciated :)

 Per-request cache statistics
 

 Key: SOLR-5637
 URL: https://issues.apache.org/jira/browse/SOLR-5637
 Project: Solr
  Issue Type: New Feature
Reporter: Shikhar Bhushan
Priority: Minor
 Attachments: SOLR-5367.patch


 We have found it very useful to have information on the number of cache hits 
 and misses for key Solr caches (filterCache, documentCache, etc.) at the 
 request level.
 This is currently implemented in our codebase using custom {{SolrCache}} 
 implementations.
 I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
 thread-local, and adding hooks in get() methods of SolrCache implementations. 
 This will be glued up using the {{DebugComponent}} and can be requested using 
 a debug.cache parameter.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2014-01-16 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873851#comment-13873851
 ] 

Shikhar Bhushan commented on SOLR-4260:
---

This may be unrelated - I have not done much digging or looked at the full 
context, but was just looking at CUSS out of curiosity.

Why do we flush() the OutputStream, but then write() things like ending tags 
afterwards? Shouldn't the flush come after all those write()s?

https://github.com/apache/lucene-solr/blob/lucene_solr_4_6/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.java#L205
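
A minimal standalone illustration of the concern (none of this is the actual CUSS code): bytes written after a flush() sit in the stream's buffer until the next flush or close, so an early flush does not make the trailing tags visible to the consumer.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Demonstrates flush ordering: only the bytes written before flush() are
// visible in the underlying sink; the "ending tag" written afterwards is
// still held in the 8 KB buffer at the point we measure.
public class FlushOrdering {
    public static int bytesVisibleAfterEarlyFlush() {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            BufferedOutputStream out = new BufferedOutputStream(sink);
            out.write("<docs>".getBytes(StandardCharsets.UTF_8));   // 6 bytes
            out.flush();                  // flush happens before the closing tag...
            out.write("</docs>".getBytes(StandardCharsets.UTF_8));  // ...still buffered
            return sink.size();           // only the pre-flush bytes are visible
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```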

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Assignee: Mark Miller
Priority: Critical
 Fix For: 5.0, 4.7

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png, 
 demo_shard1_replicas_out_of_sync.tgz


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents. The leader and slave deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention: there were small IDF differences for exactly the same record, 
 causing it to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Created] (SOLR-5629) SolrIndexSearcher.name should include core name

2014-01-14 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created SOLR-5629:
-

 Summary: SolrIndexSearcher.name should include core name
 Key: SOLR-5629
 URL: https://issues.apache.org/jira/browse/SOLR-5629
 Project: Solr
  Issue Type: Improvement
Reporter: Shikhar Bhushan
Priority: Minor


The name attribute on {{SolrIndexSearcher}} is used in log lines, but does not 
include the core name.

So in a multi-core setup it is unnecessarily difficult to trace what core's 
searcher is being referred to, e.g. in log lines that provide info on searcher 
 opens & closes.

One-line patch that helps:

Replace

{noformat}
this.name = "Searcher@" + Integer.toHexString(hashCode()) + (name!=null ? " "+name : "");
{noformat}

with

{noformat}
this.name = "Searcher@" + Integer.toHexString(hashCode()) + "[" + core.getName() + "]" + (name!=null ? " "+name : "");
{noformat}






[jira] [Commented] (SOLR-5629) SolrIndexSearcher.name should include core name

2014-01-14 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870796#comment-13870796
 ] 

Shikhar Bhushan commented on SOLR-5629:
---

Thanks Erick! Yup, in SolrIndexSearcher constructor :)

 SolrIndexSearcher.name should include core name
 ---

 Key: SOLR-5629
 URL: https://issues.apache.org/jira/browse/SOLR-5629
 Project: Solr
  Issue Type: Improvement
Reporter: Shikhar Bhushan
Assignee: Erick Erickson
Priority: Minor

 The name attribute on {{SolrIndexSearcher}} is used in log lines, but does 
 not include the core name.
 So in a multi-core setup it is unnecessarily difficult to trace what core's 
 searcher is being referred to, e.g. in log lines that provide info on 
 searcher opens & closes.
 One-line patch that helps:
 Replace
 {noformat}
 this.name = "Searcher@" + Integer.toHexString(hashCode()) + (name!=null ? " "+name : "");
 {noformat}
 with
 {noformat}
 this.name = "Searcher@" + Integer.toHexString(hashCode()) + "[" + core.getName() + "]" + (name!=null ? " "+name : "");
 {noformat}






[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2013-12-12 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847146#comment-13847146
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

Thanks for your comments Otis. I have certainly run into the situation of not 
seeing improvements when there is a higher degree of concurrency of search 
requests. So I want to try to pin down the associated costs (cost of merge, 
blocking operations, context switching, number/size of segments, etc.).

I think this could have real-world applicability, but I don't have evidence yet 
in terms of a high query concurrency benchmark. Let's take as an example a 
32-core server that serves 100 QPS at an average latency of 100ms. You'd expect 
10 search tasks/threads to be active on average. So in theory you have 22 cores 
available for helping out with the search.
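
The expected-concurrency arithmetic above is Little's law (L = λ · W): 100 QPS × 0.1 s average latency ≈ 10 requests in flight, leaving 22 of 32 cores idle on average. A sketch of that back-of-envelope calculation:

```java
// Little's law (L = lambda * W): the average number of in-flight requests
// equals the arrival rate times the average time each request spends in
// the system. Integer millisecond units keep the arithmetic exact.
public class ConcurrencyEstimate {
    public static long busyThreads(long qps, long avgLatencyMillis) {
        return qps * avgLatencyMillis / 1000;
    }

    // Cores notionally left over for intra-query parallelism.
    public static long spareCores(long totalCores, long qps, long avgLatencyMillis) {
        return totalCores - busyThreads(qps, avgLatencyMillis);
    }
}
```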

 If this parallelization is optional and those who choose not to use it don't 
 suffer from it, then this may be a good option to have for those with 
 multi-core CPUs with low query concurrency, but if that's not the case

It is optional and it is possible for parallelizable collectors to be written 
in a way that does not penalize the serial use case. E.g. the modifications to 
{{TopScoreDocCollector}} use a single {{PriorityQueue}} in the serial case, and 
a {{PriorityQueue}} for each {{AtomicReaderContext}} + 1 for the final merge in 
case parallelism is used. In the lucene-util benchmarks I ran I did not see a 
penalty on serial search with the patch.
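
The "one queue per slice, plus one for the final merge" shape can be sketched in miniature with score-only "docs" and plain {{java.util.PriorityQueue}}; this illustrates the structure only and is not the actual {{TopScoreDocCollector}} modification:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Each slice (standing in for an AtomicReaderContext) keeps its own bounded
// min-heap of the top-k scores, with no shared state and hence no locking
// during collection; a single extra heap merges the survivors at the end.
public class SliceTopK {
    public static List<Float> topKPerSliceThenMerge(List<float[]> slices, int k) {
        List<PriorityQueue<Float>> perSlice = new ArrayList<>();
        for (float[] slice : slices) {
            PriorityQueue<Float> pq = new PriorityQueue<>(); // min-heap of scores
            for (float score : slice) {
                pq.offer(score);
                if (pq.size() > k) pq.poll(); // evict current minimum
            }
            perSlice.add(pq);
        }
        // Final merge: one more bounded heap over all per-slice survivors.
        PriorityQueue<Float> merged = new PriorityQueue<>();
        for (PriorityQueue<Float> pq : perSlice) {
            for (float score : pq) {
                merged.offer(score);
                if (merged.size() > k) merged.poll();
            }
        }
        List<Float> result = new ArrayList<>(merged);
        result.sort((a, b) -> Float.compare(b, a)); // best score first
        return result;
    }
}
```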

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, 
 LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt


 h2. Motivation
 We should be able to scale-up better with Solr/Lucene by utilizing multiple 
 CPU cores, and not have to resort to scaling-out by sharding (with all the 
 associated distributed system pitfalls) when the index size does not warrant 
 it.
 Presently, IndexSearcher has an optional constructor arg for an 
 ExecutorService, which gets used for searching in parallel for call paths 
 where one of the TopDocCollector's is created internally. The 
 per-atomic-reader search happens in parallel and then the 
 TopDocs/TopFieldDocs results are merged with locking around the merge bit.
 However there are some problems with this approach:
 * If arbitrary Collector args come into play, we can't parallelize. Note that 
 even if ultimately results are going to a TopDocCollector it may be wrapped 
 inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
 * The special-casing with parallelism baked on top does not scale, there are 
 many Collector's that could potentially lend themselves to parallelism, and 
 special-casing means the parallelization has to be re-implemented if a 
 different permutation of collectors is to be used.
 h2. Proposal
 A refactoring of collectors that allows for parallelization at the level of 
 the collection protocol. 
 Some requirements that should guide the implementation:
 * easy migration path for collectors that need to remain serial
 * the parallelization should be composable (when collectors wrap other 
 collectors)
 * allow collectors to pick the optimal solution (e.g. there might be memory 
 tradeoffs to be made) by advising the collector about whether a search will 
 be parallelized, so that the serial use-case is not penalized.
 * encourage use of non-blocking constructs and lock-free parallelism, 
 blocking is not advisable for the hot-spot of a search, besides wasting 
 pooled threads.






[jira] [Comment Edited] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-12-03 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837860#comment-13837860
 ] 

Shikhar Bhushan edited comment on SOLR-5505 at 12/3/13 4:30 PM:


Hi Ryan,

* If the loggerName attribute is missing, it defaults to the fully-qualified 
class name of LoggingInfoStream (see default value used for getting the 
attribute).

* I will update the example solrconfig.xml, good call!

* Even if the logs are being sent to the same file, the logger name is almost 
always part of the formatter configuration. For the solrconfig.xml perhaps a 
good example would be
{noformat}
<infoStream loggerName="org.apache.solr.update.LoggingInfoStream.${solr.core.name}">
{noformat}

(I _think_ that actually will substitute core name correctly, will check...).


was (Author: shikhar):
Hi Ryan,

* If the loggerName attribute is missing, it defaults to the fully-qualified 
class name of LoggingInfoStream (see default value used for getting the 
attribute).

* I will update the example solrconfig.xml, good call!

* Even if the logs are being sent to the same file, the logger name is almost 
always part of the formatter configuration. For the solrconfig.xml perhaps a 
good example would be {{<infoStream loggerName="org.apache.solr.update.LoggingInfoStream.${solr.core.name}">}} (I 
_think_ that actually will substitute core name correctly, will check...).

 LoggingInfoStream not usable in a multi-core setup
 -

 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan
 Attachments: SOLR-5505.patch


 {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
 context.
 Previously this was possible by encoding this into the infoStream's file path.
 This means in a multi-core setup it is very hard to distinguish between the 
 infoStream messages for different cores.
 {{LoggingInfoStream}} should be automatically configured to prepend the core 
 name to log messages.






[jira] [Commented] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-12-03 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837860#comment-13837860
 ] 

Shikhar Bhushan commented on SOLR-5505:
---

Hi Ryan,

* If the loggerName attribute is missing, it defaults to the fully-qualified 
class name of LoggingInfoStream (see default value used for getting the 
attribute).

* I will update the example solrconfig.xml, good call!

* Even if the logs are being sent to the same file, the logger name is almost 
always part of the formatter configuration. For the solrconfig.xml perhaps a 
good example would be {{<infoStream loggerName="org.apache.solr.update.LoggingInfoStream.${solr.core.name}">}} (I 
_think_ that actually will substitute core name correctly, will check...).

 LoggingInfoStream not usable in a multi-core setup
 -

 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan
 Attachments: SOLR-5505.patch


 {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
 context.
 Previously this was possible by encoding this into the infoStream's file path.
 This means in a multi-core setup it is very hard to distinguish between the 
 infoStream messages for different cores.
 {{LoggingInfoStream}} should be automatically configured to prepend the core 
 name to log messages.






[jira] [Comment Edited] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-12-03 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837860#comment-13837860
 ] 

Shikhar Bhushan edited comment on SOLR-5505 at 12/3/13 7:18 PM:


Hi Ryan,

* If the loggerName attribute is missing, it defaults to the fully-qualified 
class name of LoggingInfoStream (see default value used for getting the 
attribute).

* I will update the example solrconfig.xml, good call!

* Even if the logs are being sent to the same file, the logger name is almost 
always part of the formatter configuration. For the solrconfig.xml perhaps a 
good example would be
{noformat}
<infoStream loggerName="org.apache.solr.update.LoggingInfoStream.${solr.core.name}">true</infoStream>
{noformat}

(I _think_ that actually will substitute core name correctly, will check...).


was (Author: shikhar):
Hi Ryan,

* If the loggerName attribute is missing, it defaults to the fully-qualified 
class name of LoggingInfoStream (see default value used for getting the 
attribute).

* I will update the example solrconfig.xml, good call!

* Even if the logs are being sent to the same file, the logger name is almost 
always part of the formatter configuration. For the solrconfig.xml perhaps a 
good example would be
{noformat}
<infoStream loggerName="org.apache.solr.update.LoggingInfoStream.${solr.core.name}">
{noformat}

(I _think_ that actually will substitute core name correctly, will check...).

 LoggingInfoStream not usable in a multi-core setup
 -

 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan
 Attachments: SOLR-5505.patch


 {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
 context.
 Previously this was possible by encoding this into the infoStream's file path.
 This means in a multi-core setup it is very hard to distinguish between the 
 infoStream messages for different cores.
 {{LoggingInfoStream}} should be automatically configured to prepend the core 
 name to log messages.






[jira] [Updated] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-12-03 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5505:
--

Attachment: SOLR-5505.patch

attaching patch with updated example solrconfig.xml

 LoggingInfoStream not usable in a multi-core setup
 -

 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan
 Attachments: SOLR-5505.patch, SOLR-5505.patch


 {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
 context.
 Previously this was possible by encoding this into the infoStream's file path.
 This means in a multi-core setup it is very hard to distinguish between the 
 infoStream messages for different cores.
 {{LoggingInfoStream}} should be automatically configured to prepend the core 
 name to log messages.






[jira] [Updated] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-11-27 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-5505:
--

Attachment: SOLR-5505.patch

Attaching patch against trunk.

It does something different than what I proposed earlier:

a) {{LoggingInfoStream}} constructor takes the slf4j {{Logger}} instance to be 
used as a constructor param.

b) {{SolrIndexConfig}} checks whether there is a {{loggerName}} configuration 
attribute on the {{infoStream}} tag, and if so it is used as the name for the 
{{Logger}}. Otherwise, the previous default of the {{LoggingInfoStream}} class 
name is used. This enables users to manage the log output through their logging 
subsystem, e.g. the formatting pattern, which log file it goes to, etc.

c) Additionally, I removed logging of the thread name from within 
{{LoggingInfoStream}}, since this is commonly configured as part of the 
formatting pattern for a logger.

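Point (b) above, reduced to a sketch: pick the logger name from an optional loggerName attribute, falling back to the {{LoggingInfoStream}} class name. The attribute lookup is simplified to a Map here; the real code would read it from the parsed infoStream element in solrconfig.xml:

```java
import java.util.Map;

// Illustrative logger-name resolution: an optional "loggerName" attribute
// overrides the default of the LoggingInfoStream class name, so users can
// route and format the output via their logging configuration.
public class InfoStreamConfigSketch {
    public static final String DEFAULT_LOGGER_NAME =
            "org.apache.solr.update.LoggingInfoStream";

    public static String loggerName(Map<String, String> attrs) {
        return attrs.getOrDefault("loggerName", DEFAULT_LOGGER_NAME);
    }
}
```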
 LoggingInfoStream not usable in a multi-core setup
 -

 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan
 Attachments: SOLR-5505.patch


 {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
 context.
 Previously this was possible by encoding this into the infoStream's file path.
 This means in a multi-core setup it is very hard to distinguish between the 
 infoStream messages for different cores.
 {{LoggingInfoStream}} should be automatically configured to prepend the core 
 name to log messages.






[jira] [Commented] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-11-26 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13832802#comment-13832802
 ] 

Shikhar Bhushan commented on SOLR-5505:
---

I'll create a patch today

 LoggingInfoStream not usable in a multi-core setup
 -

 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan

 {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
 context.
 Previously this was possible by encoding this into the infoStream's file path.
 This means in a multi-core setup it is very hard to distinguish between the 
 infoStream messages for different cores.
 {{LoggingInfoStream}} should be automatically configured to prepend the core 
 name to log messages.






[jira] [Created] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-11-25 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created SOLR-5505:
-

 Summary: LoggingInfoStream not usable in a multi-core setup
 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan


{{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core 
context.

Previously this was possible by encoding this into the infoStream's file path.

This means in a multi-core setup it is very hard to distinguish between the 
infoStream messages for different cores.

{{LoggingInfoStream}} should be automatically configured to prepend the core 
name to log messages.






[jira] [Commented] (SOLR-4977) info stream in solrconfig should have option for writing to the solr log

2013-11-25 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831932#comment-13831932
 ] 

Shikhar Bhushan commented on SOLR-4977:
---

LoggingInfoStream does not log the core name, which is an important piece of 
context - created SOLR-5505 for this.

 info stream in solrconfig should have option for writing to the solr log
 

 Key: SOLR-4977
 URL: https://issues.apache.org/jira/browse/SOLR-4977
 Project: Solr
  Issue Type: Improvement
Reporter: Ryan Ernst
 Fix For: 4.4, 5.0

 Attachments: SOLR-4977.patch, SOLR-4977.patch, SOLR-4977.patch, 
 SOLR-4977.patch, SOLR-4977.patch, SOLR-4977.patch, SOLR-4977.patch


 Having a separate file is annoying, plus the print stream option doesn't 
 rollover on size or date, doesn't have custom formatting options, etc.  
 Exactly what the logging lib is meant to handle.






[jira] [Commented] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup

2013-11-25 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831937#comment-13831937
 ] 

Shikhar Bhushan commented on SOLR-5505:
---

This should be a simple patch: {{SolrIndexConfig}} can propagate the core name 
to the {{LoggingInfoStream}} constructor so that it's available for logging.
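As a rough illustration of that approach - a pure-JDK stand-in, not the actual 
Solr classes; the class and method names here are invented:

```java
// Simplified stand-in for Solr's LoggingInfoStream, illustrating the proposed
// fix: accept the core name in the constructor and prepend it to every message.
import java.util.logging.Logger;

public class CoreAwareInfoStream {
    private static final Logger LOG =
        Logger.getLogger(CoreAwareInfoStream.class.getName());
    private final String coreName;

    public CoreAwareInfoStream(String coreName) {
        this.coreName = coreName;
    }

    // Mirrors the shape of InfoStream.message(component, message):
    // the core name is baked into every formatted line.
    public String format(String component, String message) {
        return "[" + coreName + "] [" + component + "] " + message;
    }

    public void message(String component, String message) {
        LOG.info(format(component, message));
    }
}
```

With this, log lines from different cores become distinguishable without 
per-core log files.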

 LoggingInfoStream not usable in a multi-core setup
 -

 Key: SOLR-5505
 URL: https://issues.apache.org/jira/browse/SOLR-5505
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Shikhar Bhushan







[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-27 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-5299:


Attachment: LUCENE-5299.patch

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch, LUCENE-5299.patch, 
 LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch


 h2. Motivation
 We should be able to scale-up better with Solr/Lucene by utilizing multiple 
 CPU cores, and not have to resort to scaling-out by sharding (with all the 
 associated distributed system pitfalls) when the index size does not warrant 
 it.
 Presently, IndexSearcher has an optional constructor arg for an 
 ExecutorService, which gets used for searching in parallel for call paths 
 where one of the TopDocCollectors is created internally. The 
 per-atomic-reader search happens in parallel and then the 
 TopDocs/TopFieldDocs results are merged with locking around the merge bit.
 However there are some problems with this approach:
 * If arbitrary Collector args come into play, we can't parallelize. Note that 
 even if results are ultimately going to a TopDocCollector, it may be wrapped 
 inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
 * The special-casing with parallelism baked on top does not scale: there are 
 many Collectors that could potentially lend themselves to parallelism, and 
 special-casing means the parallelization has to be re-implemented if a 
 different permutation of collectors is to be used.
 h2. Proposal
 A refactoring of collectors that allows for parallelization at the level of 
 the collection protocol. 
 Some requirements that should guide the implementation:
 * easy migration path for collectors that need to remain serial
 * the parallelization should be composable (when collectors wrap other 
 collectors)
 * allow collectors to pick the optimal solution (e.g. there might be memory 
 tradeoffs to be made) by advising the collector about whether a search will 
 be parallelized, so that the serial use-case is not penalized.
 * encourage use of non-blocking constructs and lock-free parallelism, 
 blocking is not advisable for the hot-spot of a search, besides wasting 
 pooled threads.






[jira] [Commented] (SOLR-5363) NoClassDefFoundError when using Apache Log4J2

2013-10-23 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802926#comment-13802926
 ] 

Shikhar Bhushan commented on SOLR-5363:
---

Confirming the issue &amp; the Petar's assessment, ran into this as well

 NoClassDefFoundError when using Apache Log4J2
 -

 Key: SOLR-5363
 URL: https://issues.apache.org/jira/browse/SOLR-5363
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5
Reporter: Petar Tahchiev
  Labels: log4j2
 Attachments: SOLR-5363.patch


 Hey guys,
 I'm using Log4J2 + SLF4J in my project. Unfortunately my embedded solr server 
 throws this error when starting:
 {code}
 Caused by: org.springframework.beans.factory.BeanDefinitionStoreException: 
 Factory method [public org.springframework.da
 ta.solr.core.SolrOperations 
 com.x.platform.core.config.SolrsearchConfig.defaultSolrTemplate() throws 
 javax.xml.par
 sers.ParserConfigurationException,java.io.IOException,org.xml.sax.SAXException]
  threw exception; nested exception is org
 .springframework.beans.factory.BeanCreationException: Error creating bean 
 with name 'defaultSolrServer' defined in class
  path resource [com/x/platform/core/config/SolrsearchConfig.class]: 
 Instantiation of bean failed; nested exception
  is org.springframework.beans.factory.BeanDefinitionStoreException: Factory 
 method [public org.apache.solr.client.solrj.
 SolrServer 
 com.xx.platform.core.config.SolrsearchConfig.defaultSolrServer() throws 
 javax.xml.parsers.ParserConfigur
 ationException,java.io.IOException,org.xml.sax.SAXException] threw exception; 
 nested exception is java.lang.NoClassDefFo
 undError: org/apache/log4j/Priority
 at 
 org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy
 .java:181)
 at 
 org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolv
 er.java:570)
 ... 105 more
 Caused by: org.springframework.beans.factory.BeanCreationException: Error 
 creating bean with name 'defaultSolrServer' de
 fined in class path resource 
 [com/xx/platform/core/config/SolrsearchConfig.class]: Instantiation of 
 bean failed; ne
 sted exception is 
 org.springframework.beans.factory.BeanDefinitionStoreException: Factory 
 method [public org.apache.solr
 .client.solrj.SolrServer 
 com.xxx.platform.core.config.SolrsearchConfig.defaultSolrServer() throws 
 javax.xml.parsers.
 ParserConfigurationException,java.io.IOException,org.xml.sax.SAXException] 
 threw exception; nested exception is java.lan
 g.NoClassDefFoundError: org/apache/log4j/Priority
 at 
 org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolv
 er.java:581)
 at 
 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(Ab
 stractAutowireCapableBeanFactory.java:1025)
 at 
 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutow
 ireCapableBeanFactory.java:921)
 at 
 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCap
 ableBeanFactory.java:487)
 at 
 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapab
 leBeanFactory.java:458)
 at 
 org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
 at 
 org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegis
 try.java:223)
 at 
 org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
 at 
 org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
 at 
 org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(Configurati
 onClassEnhancer.java:298)
 at 
 com.xx.platform.core.config.SolrsearchConfig$$EnhancerByCGLIB$$c571c5a6.defaultSolrServer(generated)
 at 
 com.x.platform.core.config.SolrsearchConfig.defaultSolrTemplate(SolrsearchConfig.java:37)
 at 
 com.xx.platform.core.config.SolrsearchConfig$$EnhancerByCGLIB$$c571c5a6.CGLIB$defaultSolrTemplate$2(gen
 erated)
 at 
 com.x.platform.core.config.SolrsearchConfig$$EnhancerByCGLIB$$c571c5a6$$FastClassByCGLIB$$f67069c2.invo
 ke(generated)
 at 
 org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
 at 
 org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(Configurati
 onClassEnhancer.java:286)
 at 
 

[jira] [Comment Edited] (SOLR-5363) NoClassDefFoundError when using Apache Log4J2

2013-10-23 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802926#comment-13802926
 ] 

Shikhar Bhushan edited comment on SOLR-5363 at 10/23/13 3:03 PM:
-

Confirming the issue &amp; Petar's assessment, ran into this as well


was (Author: shikhar):
Confirming the issue &amp; the Petar's assessment, ran into this as well

 NoClassDefFoundError when using Apache Log4J2
 -

 Key: SOLR-5363
 URL: https://issues.apache.org/jira/browse/SOLR-5363
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5
Reporter: Petar Tahchiev
  Labels: log4j2
 Attachments: SOLR-5363.patch


 

[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-23 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-5299:


Attachment: LUCENE-5299.patch

Attaching latest patch. Broken up into commits at 
https://github.com/shikhar/lucene-solr/compare/apache:trunk...trunk?w=1.

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch, LUCENE-5299.patch, 
 LUCENE-5299.patch, LUCENE-5299.patch








[jira] [Created] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created LUCENE-5299:
---

 Summary: Refactor Collector API for parallelism
 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan


h2. Motivation

We should be able to scale-up better with Solr/Lucene by utilizing multiple CPU 
cores, and not have to resort to scaling-out by sharding (with all the 
associated distributed system pitfalls) when the index size does not warrant it.

Presently, IndexSearcher has an optional constructor arg for an 
ExecutorService, which gets used for searching in parallel for call paths where 
one of the TopDocCollectors is created internally. The per-atomic-reader 
search happens in parallel and then the TopDocs/TopFieldDocs results are merged 
with locking around the merge bit.

However there are some problems with this approach:

* If arbitrary Collector args come into play, we can't parallelize. Note that 
even if results are ultimately going to a TopDocCollector, it may be wrapped 
inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
* The special-casing with parallelism baked on top does not scale: there are 
many Collectors that could potentially lend themselves to parallelism, and 
special-casing means the parallelization has to be re-implemented if a 
different permutation of collectors is to be used.

h2. Proposal

A refactoring of collectors that allows for parallelization at the level of the 
collection protocol. 

Some requirements that should guide the implementation:

* easy migration path for collectors that need to remain serial
* the parallelization should be composable (when collectors wrap other 
collectors)
* allow collectors to pick the optimal solution (e.g. there might be memory 
tradeoffs to be made) by advising the collector about whether a search will be 
parallelized, so that the serial use-case is not penalized.
* encourage use of non-blocking constructs and lock-free parallelism, blocking 
is not advisable for the hot-spot of a search, besides wasting pooled threads.
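A minimal pure-JDK sketch of the collection protocol these requirements point 
at - per-segment collection running on an executor, merged through a lock-free 
accumulator - might look as follows. This is illustrative only: it does not use 
Lucene's Collector API, and all names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.LongAdder;

public class ParallelHitCount {
    // Each int[] stands in for the doc ids one segment ("atomic reader") matches.
    public static long count(List<int[]> segments, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        LongAdder total = new LongAdder();           // lock-free merge target
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int[] segment : segments) {
                // One independent task per segment; no locking while collecting.
                futures.add(pool.submit(() -> total.add(segment.length)));
            }
            for (Future<?> f : futures) {
                f.get();                             // propagate any failures
            }
        } finally {
            pool.shutdown();
        }
        return total.sum();
    }
}
```

A TotalHitCountCollector-style counter is the easiest case; the point of the 
proposal is that the same shape (independent per-segment work, explicit merge) 
should be available to arbitrary collectors.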






[jira] [Created] (SOLR-5372) SolrIndexSearcher should support propagating an ExecutorService upto IndexSearcher constructor

2013-10-21 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created SOLR-5372:
-

 Summary: SolrIndexSearcher should support propagating an 
ExecutorService upto IndexSearcher constructor
 Key: SOLR-5372
 URL: https://issues.apache.org/jira/browse/SOLR-5372
 Project: Solr
  Issue Type: Improvement
Reporter: Shikhar Bhushan


This could probably be made configurable from solrconfig.xml. We should be 
able to easily configure the kind of executor to be chosen with params.

The idea here is to benefit from improvements being proposed in LUCENE-5299.
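A hedged sketch of what such configurability could look like: an executor 
factory driven by params (the parameter names here are invented, not actual 
Solr config), whose result SolrIndexSearcher would pass up to the 
IndexSearcher constructor.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SearchExecutorFactory {
    // Builds an ExecutorService from simple string params, e.g. parsed from
    // hypothetical solrconfig.xml attributes like type="fixed" size="8".
    public static ExecutorService fromParams(Map<String, String> params) {
        String type = params.getOrDefault("type", "fixed");
        int size = Integer.parseInt(params.getOrDefault("size", "4"));
        switch (type) {
            case "fixed":  return Executors.newFixedThreadPool(size);
            case "cached": return Executors.newCachedThreadPool();
            default: throw new IllegalArgumentException("unknown executor type: " + type);
        }
    }
}
```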






[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800762#comment-13800762
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

patch and benchmarks to come...

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan







[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-5299:


Attachment: benchmarks.txt
LUCENE-5299.patch

attaching patch against trunk + benchmarks

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch








[jira] [Issue Comment Deleted] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-5299:


Comment: was deleted

(was: patch and benchmarks to come...)

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch








[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800947#comment-13800947
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

bq. Could you describe a bit about the high level design changes?

There is an overview in this email under 'Idea': 
http://mail-archives.apache.org/mod_mbox/lucene-dev/201310.mbox/%3CCAE_Gd_dt6LY5T9r6ty%2B1j2xEbdr84OCPkU5swsQn10cbDt81Ew%40mail.gmail.com%3E

bq. In the benchmarks, is par vs par the before/after test? Ie baseline = 
current trunk, passed an ES to IndexSearcher, and then comp = with this patch, 
also passing ES to IndexSearcher?

Exactly, sorry that wasn't made clear.

bq. In general, I suspect fine-grained parallelism is trickier / more costly 
than the merge-in-the-end parallelism we have now. Typically collection is 
not a very costly part of the search ... and merging the results in the end 
should be a minor cost, that shrinks as the index gets larger.

"Typically collection is not a very costly part of the search" - I don't know 
if that's true. Are you referring to just the bits that might happen inside a 
Collector, or a broader definition of collection that includes scoring and 
potentially some degree of I/O? This change is aiming to parallelize the 
latter. To do this, the Collector API needs refactoring to cleanly separate out 
the AtomicReader-level state and the composite state, in case they are 
different. 
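A toy example of that separation, assuming nothing about the actual patch (all 
names invented): a composite object owns the cross-segment state and hands out 
independent per-segment collectors whose results are folded back in when each 
segment completes.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CompositeHitCounter {
    private final AtomicLong total = new AtomicLong();   // composite state

    // Per-AtomicReader state: a plain counter, nothing shared while collecting.
    public class LeafCounter {
        private long hits;
        public void collect(int doc) { hits++; }
        public void done() { total.addAndGet(hits); }    // the merge step
    }

    public LeafCounter newLeafCounter() { return new LeafCounter(); }
    public long totalHits() { return total.get(); }
}
```

Because each LeafCounter is private to one segment's thread, the only 
synchronization point is the single atomic add in done().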

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch








[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800954#comment-13800954
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

Thanks for your comments [~thetaphi], I really appreciate the vote of 
confidence in the API changes :)

bq. My biggest concern is not complexity of API (it is actually simpler and 
easier to understand!): it is more the fact that parallelism of Lucene Queries 
is in most cases not the best thing to do (if you have many users). It only 
makes sense if you have very few queries - which is not what full-text 
searches are used for. The overhead for merging is higher than what you get, 
especially when many users hit your search engine in parallel! I generally 
don't recommend to users to use the parallelization currently available in 
IndexSearcher. Every user gets one thread and if you have many users buy more 
processors. With additional parallelism this does not scale if the userbase grows.

There is certainly more work to be done overall per search request for the 
Collectors where parallelization implies merge step(s) [1]. It could mean better 
latency at the cost of additional hardware to sustain the same level of load, 
but it's a choice that should be available when developing search applications.

[1] there are trivially parallelizable collectors where the merge step is 
either really small or non-existent: e.g. TotalHitCountCollector, or even 
FacetCollector 
(https://github.com/shikhar/lucene-solr/commit/032683da739bf15c1a8afe9f15cb2586baa0b201?w=1)
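A sketch of the merge-free end of that spectrum (an illustration only, not what the linked commit does): if slices share a striped, non-blocking counter such as JDK 8's LongAdder, the "merge" collapses into a single sum over the stripes, and the hot path never blocks.

```java
import java.util.concurrent.atomic.LongAdder;

public class MergeFreeCount {
    // Count hits across `threads` workers with no locks and no explicit merge:
    // LongAdder stripes its cells internally under contention.
    static long countParallel(int threads, int docsPerThread) throws InterruptedException {
        final LongAdder hits = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int doc = 0; doc < docsPerThread; doc++) {
                    hits.increment(); // lock-free hot path
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        return hits.sum(); // the only "merge": summing the stripes
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countParallel(4, 100_000)); // prints 400000
    }
}
```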

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch


 h2. Motivation
 We should be able to scale-up better with Solr/Lucene by utilizing multiple 
 CPU cores, and not have to resort to scaling-out by sharding (with all the 
 associated distributed system pitfalls) when the index size does not warrant 
 it.
 Presently, IndexSearcher has an optional constructor arg for an 
 ExecutorService, which gets used for searching in parallel for call paths 
 where one of the TopDocCollector's is created internally. The 
 per-atomic-reader search happens in parallel and then the 
 TopDocs/TopFieldDocs results are merged with locking around the merge bit.






[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801019#comment-13801019
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

I'm planning to work on parallelizing TopFieldCollector in the same way as for 
TopScoreDocCollector, so the special-casing from IndexSearcher can be removed 
and searches are parallelizable even if that collector gets wrapped in 
something else by Solr. 

We are going to be doing some load-tests and latency measurements on one of our 
experimental clusters using real traffic logs, and I will report those 
findings. But first need to do that work on TopFieldCollector as most of our 
requests have multiple sort fields.
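For intuition on the per-slice work and merge work in a top-k collector like TopScoreDocCollector, here is a toy version (illustrative only; real Lucene collectors track doc ids and tie-break by doc, which this sketch omits): each slice keeps a bounded min-heap of its k best scores, and the merge step combines the per-slice heaps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class TopKMerge {
    // Per-slice collection: keep only the k best scores in a min-heap,
    // so the root is always the worst score currently retained.
    static PriorityQueue<Double> topK(double[] scores, int k) {
        PriorityQueue<Double> heap = new PriorityQueue<>();
        for (double s : scores) {
            heap.offer(s);
            if (heap.size() > k) heap.poll(); // evict current worst
        }
        return heap;
    }

    // Merge step: fold every slice's survivors into one final top-k.
    static List<Double> merge(List<PriorityQueue<Double>> sliceHeaps, int k) {
        PriorityQueue<Double> all = new PriorityQueue<>();
        for (PriorityQueue<Double> h : sliceHeaps) {
            for (double s : h) {
                all.offer(s);
                if (all.size() > k) all.poll();
            }
        }
        List<Double> result = new ArrayList<>(all);
        result.sort((a, b) -> Double.compare(b, a)); // best first
        return result;
    }

    public static void main(String[] args) {
        List<PriorityQueue<Double>> heaps = new ArrayList<>();
        heaps.add(topK(new double[] {0.1, 0.9, 0.5}, 2));
        heaps.add(topK(new double[] {0.7, 0.2}, 2));
        System.out.println(merge(heaps, 2)); // prints [0.9, 0.7]
    }
}
```

The merge cost grows with the number of slices times k, which is the extra per-request work discussed earlier in the thread.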

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch








[jira] [Comment Edited] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801019#comment-13801019
 ] 

Shikhar Bhushan edited comment on LUCENE-5299 at 10/21/13 8:08 PM:
---

I'm planning to work on parallelizing TopFieldCollector in the same way as for 
TopScoreDocCollector, so the special-casing from IndexSearcher can be removed 
and searches are parallelizable even if that collector gets wrapped in 
something else by Solr. 

We are going to be doing some load-tests and latency measurements on one of our 
experimental clusters using real traffic logs, and I will report those 
findings. But first need to do that work on TopFieldCollector as most of our 
requests have multiple sort fields.


was (Author: shikhar):
I'm planning to work on parallelizing TopFieldCollector in the same way as for 
TopScoreDocCollector, so the special-casing from IndexSearcher can be removed 
and searches are parallelizable even if that collector gets wrapped in 
something else by Solr. 

We am going to be doing some load-tests and latency measurements on one of our 
experimental clusters using real traffic logs, and I will report those 
findings. But first need to do that work on TopFieldCollector as most of our 
requests have multiple sort fields.

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch








[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801028#comment-13801028
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-

bq. What do you have the number of search threads set to in luceneutil?

I did not change any of the defaults - what setting is this?

bq. If this is too low, maybe its not utilizing all your hardware in the 
benchmark. (like a web server with a too-small PQ)

What's a PQ? :)

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch








[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-5299:


Attachment: LUCENE-5299.patch

Attaching patch with the TopFieldCollector changes + removal of a bunch of 
unnecessary code from IndexSearcher.

Tests pass, except TestExpressionSorts fails sometimes (see LUCENE-5222); will 
reopen that and provide a fix.

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch, LUCENE-5299.patch








[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism

2013-10-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-5299:


Attachment: LUCENE-5299.patch

 Refactor Collector API for parallelism
 --

 Key: LUCENE-5299
 URL: https://issues.apache.org/jira/browse/LUCENE-5299
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
 Attachments: benchmarks.txt, LUCENE-5299.patch, LUCENE-5299.patch, 
 LUCENE-5299.patch








[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-12 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765739#comment-13765739
 ] 

Shikhar Bhushan commented on SOLR-4816:
---

Thanks Mark! Also for adding the call to lbServer.shutdown() when appropriate.

This is a really minor thing, but I later realized 

{{final Map<String, Future<NamedList<?>>> responseFutures = new HashMap<String, Future<NamedList<?>>>();}}

is better declared with an initialCapacity, since the size is known:

{{final Map<String, Future<NamedList<?>>> responseFutures = new HashMap<String, Future<NamedList<?>>>(routes.size());}}
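One nuance worth hedging: HashMap resizes (rehashing every entry) once its size exceeds capacity × load factor (0.75 by default), so passing routes.size() as the capacity can still trigger one final rehash; dividing by the load factor avoids resizing entirely. A toy illustration (the keys are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMap {
    static Map<String, Integer> build(int expected) {
        // Capacity chosen so `expected` inserts never exceed the 0.75 resize
        // threshold, hence no intermediate rehashing.
        Map<String, Integer> m = new HashMap<>((int) (expected / 0.75f) + 1);
        for (int i = 0; i < expected; i++) {
            m.put("route-" + i, i); // hypothetical route keys, for illustration
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(build(100).size()); // prints 100
    }
}
```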

 Add document routing to CloudSolrServer
 ---

 Key: SOLR-4816
 URL: https://issues.apache.org/jira/browse/SOLR-4816
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.3
Reporter: Joel Bernstein
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: RequestTask-removal.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch


 This issue adds the following enhancements to CloudSolrServer's update logic:
 1) Document routing: Updates are routed directly to the correct shard leader 
 eliminating document routing at the server.
 2) Optional parallel update execution: Updates for each shard are executed in 
 a separate thread so parallel indexing can occur across the cluster.
 These enhancements should allow for near linear scalability on indexing 
 throughput.
 Usage:
 CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
 cloudClient.setParallelUpdates(true); 
 SolrInputDocument doc1 = new SolrInputDocument();
 doc1.addField("id", "0");
 doc1.addField("a_t", "hello1");
 SolrInputDocument doc2 = new SolrInputDocument();
 doc2.addField("id", "2");
 doc2.addField("a_t", "hello2");
 UpdateRequest request = new UpdateRequest();
 request.add(doc1);
 request.add(doc2);
 request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
 NamedList response = cloudClient.request(request); // Returns a backwards 
 compatible condensed response.
 //To get more detailed response down cast to RouteResponse:
 CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-11 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated SOLR-4816:
--

Attachment: RequestTask-removal.patch

 Add document routing to CloudSolrServer
 ---

 Key: SOLR-4816
 URL: https://issues.apache.org/jira/browse/SOLR-4816
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.3
Reporter: Joel Bernstein
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: RequestTask-removal.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch






[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-11 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765114#comment-13765114
 ] 

Shikhar Bhushan commented on SOLR-4816:
---

We've run into some issues with CloudSolrServer leaking loads of 
LBHttpSolrServer's aliveCheckExecutor thread pools with {{parallelUpdates = 
true}}.

The root cause here is that the RequestTask inner class is creating a new 
LBHttpSolrServer for each run() rather than utilizing CloudSolrServer.lbServer 
which is already available to it.

Some detail: LBHttpSolrServer lazily initializes a single-threaded 
ScheduledExecutorService for the aliveCheckExecutor when e.g. there is some 
kind of error talking to a server. So this issue tends to come up when Solr 
nodes are unavailable and exceptions are thrown. There is also no call to 
shutdown() on that LBHttpSolrServer which gets created from RequestTask.run(). 
LBHttpSolrServer does have a finalizer that tries to shut down the 
aliveCheckExecutor, but there's no guarantee of finalizers executing (or maybe 
there is some other memory leak preventing that LBHttpSolrServer from being 
GC'ed at all).

So the one-liner fix that should definitely go in is to simply have RequestTask 
use CloudSolrServer.lbServer.

I have attached a patch that removes RequestTask altogether in favor of simply 
using Callables and Futures, which is much more idiomatic. 
(RequestTask-removal.patch)
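The shape of that idiom, as a standalone sketch (illustrative names; the lambda here is a stand-in for the real per-route HTTP request): submit one Callable per route to a shared executor, keep the Futures in a map, then join them — no hand-rolled worker class, and the shared client is reused rather than re-created per task.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelRoutes {
    static Map<String, String> sendAll(Map<String, String> routes, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        Map<String, Future<String>> futures = new HashMap<>(); // one future per route
        for (Map.Entry<String, String> route : routes.entrySet()) {
            futures.put(route.getKey(),
                pool.submit(() -> "ack:" + route.getValue())); // stand-in for the request
        }
        Map<String, String> responses = new HashMap<>();
        for (Map.Entry<String, Future<String>> e : futures.entrySet()) {
            responses.put(e.getKey(), e.getValue().get()); // failures surface as ExecutionException
        }
        return responses;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Map<String, String> routes = new HashMap<>();
            routes.put("shard1", "doc-batch-1");
            routes.put("shard2", "doc-batch-2");
            System.out.println(sendAll(routes, pool).get("shard1")); // prints ack:doc-batch-1
        } finally {
            pool.shutdown();
        }
    }
}
```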

 Add document routing to CloudSolrServer
 ---

 Key: SOLR-4816
 URL: https://issues.apache.org/jira/browse/SOLR-4816
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.3
Reporter: Joel Bernstein
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: RequestTask-removal.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch






[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-11 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765120#comment-13765120
 ] 

Shikhar Bhushan commented on SOLR-4816:
---

This is a separate issue but worth noting: CloudSolrServer.shutdown() does not 
call lbServer.shutdown().

When the lbServer is provided as a constructor arg from outside, that probably 
makes sense.

But in the case of the constructors where it is created internally, IMO 
CloudSolrServer should assume ownership and also shut it down.
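The ownership convention being proposed can be sketched generically (an illustrative class, using an ExecutorService as the stand-in resource): shut down the inner resource only when this object created it; a caller-supplied one stays the caller's responsibility.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OwnedResource {
    private final ExecutorService inner;
    private final boolean ownsInner; // true only for the internally-created case

    // Caller supplies the resource: caller keeps ownership.
    public OwnedResource(ExecutorService supplied) {
        this.inner = supplied;
        this.ownsInner = false;
    }

    // Created internally: this object owns it and must shut it down.
    public OwnedResource() {
        this.inner = Executors.newSingleThreadExecutor();
        this.ownsInner = true;
    }

    public void shutdown() {
        if (ownsInner) {
            inner.shutdown(); // symmetric with internal creation; no leaked threads
        }
    }

    public boolean innerShutdown() {
        return inner.isShutdown();
    }
}
```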

 Add document routing to CloudSolrServer
 ---

 Key: SOLR-4816
 URL: https://issues.apache.org/jira/browse/SOLR-4816
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.3
Reporter: Joel Bernstein
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: RequestTask-removal.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
 SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch






[jira] [Commented] (LUCENE-4998) be more precise about IOContext for reads

2013-09-11 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765122#comment-13765122
 ] 

Shikhar Bhushan commented on LUCENE-4998:
-

[~mikemccand] maybe this patch is up your alley?

 be more precise about IOContext for reads
 -

 Key: LUCENE-4998
 URL: https://issues.apache.org/jira/browse/LUCENE-4998
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
Priority: Minor
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4998.patch


 Set the context as {{IOContext.READ}} / {{IOContext.READONCE}} where 
 applicable
 
 Motivation:
 Custom {{PostingsFormat}} may want to check the context on 
 {{SegmentReadState}} and branch differently, but for this to work properly 
 the context has to be specified correctly up the stack.
 For example, {{DirectPostingsFormat}} only loads postings into memory if the 
 {{context != MERGE}}. However a better condition would be {{context == 
 Context.READ && !context.readOnce}}.
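A minimal, self-contained sketch of the two conditions being compared (the {{Context}} enum and {{IOContext}} class below are simplified stand-ins for Lucene's {{org.apache.lucene.store.IOContext}}, not the real API):

```java
// Simplified stand-ins for Lucene's IOContext; the real class lives in
// org.apache.lucene.store and carries more state.
enum Context { MERGE, READ, FLUSH, DEFAULT }

class IOContext {
    final Context context;
    final boolean readOnce;

    IOContext(Context context, boolean readOnce) {
        this.context = context;
        this.readOnce = readOnce;
    }
}

public class LoadDecision {
    // Current DirectPostingsFormat check: load postings unless merging.
    static boolean loadsWithMergeCheck(IOContext ctx) {
        return ctx.context != Context.MERGE;
    }

    // Proposed stricter check: load only for long-lived reads.
    static boolean loadsWithReadCheck(IOContext ctx) {
        return ctx.context == Context.READ && !ctx.readOnce;
    }

    public static void main(String[] args) {
        // A read-once context still loads postings into memory under the
        // current check, but would not under the proposed one.
        IOContext readOnce = new IOContext(Context.READ, true);
        System.out.println(loadsWithMergeCheck(readOnce)); // prints true
        System.out.println(loadsWithReadCheck(readOnce));  // prints false
    }
}
```

The second check only becomes reliable once {{IOContext.READ}} / {{IOContext.READONCE}} are actually propagated down the stack, which is what the patch is about.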

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3852) Admin UI - Cloud Tree with HTTP-Status 500 and an ArrayIndexOutOfBoundsException when using external ZK

2013-07-24 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719193#comment-13719193
 ] 

Shikhar Bhushan commented on SOLR-3852:
---

We've run into this. It was an ArrayIndexOutOfBoundsException arising out of: 
https://github.com/apache/lucene-solr/blob/4ce168a/solr/core/src/java/org/apache/solr/servlet/ZookeeperInfoServlet.java#L303

We have some znodes storing binary data, but that bit above assumes that if a 
znode has data, it'll be a UTF-8 encoded string.

That block doesn't actually do anything post-decode, so maybe it should just be 
removed.
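If the decode were kept, a defensive version would have to tolerate non-UTF-8 payloads instead of failing. A standalone sketch of such a check (not the servlet's actual code; the class and method names here are made up for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    // Returns the decoded string if data is valid UTF-8, or null for
    // binary payloads, instead of assuming every znode holds text.
    static String decodeUtf8OrNull(byte[] data) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data))
                    .toString();
        } catch (CharacterCodingException e) {
            return null; // binary znode data
        }
    }

    public static void main(String[] args) {
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] binary = {(byte) 0xC3, (byte) 0x28}; // invalid UTF-8 sequence
        System.out.println(decodeUtf8OrNull(text));   // prints hello
        System.out.println(decodeUtf8OrNull(binary)); // prints null
    }
}
```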

 Admin UI - Cloud Tree with HTTP-Status 500 and an 
 ArrayIndexOutOfBoundsException when using external ZK
 ---

 Key: SOLR-3852
 URL: https://issues.apache.org/jira/browse/SOLR-3852
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0-BETA
 Environment: Tomcat 6, external zookeeper-3.3.5 
Reporter: Vadim Kisselmann

 It works with embedded ZK.
 But when we use an external ZK (3.3.5), and this ZK has other nodes like 
 hbase, broker, etc. (with child nodes in unspecified formats), we get this 
 error in the Admin UI in the Cloud-Tree view: Loading of undefined failed with 
 HTTP-Status 500.
 Important(!): The cluster still works. Our external ZK sees the Solr servers 
 (live-nodes) and has the Solr config files from the initial import. All the nodes 
 like collections, configs, overseer-elect are here.
 Only the Admin UI has a problem showing the Cloud-Tree. Cloud-Graph works!
 The Catalina log files are free of error messages; I have only this stack trace 
 from Firebug:
 (Tomcat's standard HTML error report page; essential content:)
 HTTP Status 500 -
 type: Exception report
 description: The server encountered an internal error () that prevented it 
 from fulfilling this request.
 exception: java.lang.ArrayIndexOutOfBoundsException
 note: The full stack trace of the root cause is available in the Apache 
 Tomcat/6.0.28 logs.




[jira] [Commented] (SOLR-4379) solr-core has a dependency to slf4j-jdk14 and is not binding agnostic

2013-06-26 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694276#comment-13694276
 ] 

Shikhar Bhushan commented on SOLR-4379:
---

This is quite problematic: if you are transitively pulling in {{solr-core}}'s 
dependencies you'd get {{slf4j-jdk14}} on the classpath, and slf4j complains 
(not to mention the potential for it selecting the wrong binding):

{noformat}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:redacted/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:redacted/lib/slf4j-jdk14-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{noformat}

The fix should, I think, be as simple as adding {{<scope>runtime</scope>}} for 
{{slf4j-jdk14}} in the solr-core POM.
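The proposed POM change would look something like the following sketch (version number illustrative). Note that Maven's {{runtime}} scope is still transitive at runtime, so {{provided}} or {{<optional>true</optional>}} would keep the binding out of consumers' dependency graphs entirely:

```xml
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-jdk14</artifactId>
  <version>1.7.5</version>
  <!-- runtime: off the compile classpath, but still propagated to
       consumers at runtime scope; provided/optional would not be -->
  <scope>runtime</scope>
</dependency>
```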

 solr-core has a dependency to slf4j-jdk14 and is not binding agnostic
 -

 Key: SOLR-4379
 URL: https://issues.apache.org/jira/browse/SOLR-4379
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.1
Reporter: Nicolas Labrot
Priority: Minor

 solr-core can be used as a dependency in other projects which use other 
 bindings. In these cases slf4j-jdk14 must be excluded.
 In my opinion it may be better to move the slf4j-jdk14 dependency from 
 solr-core to the war project. 
 solr-core will then be binding agnostic.




[jira] [Comment Edited] (SOLR-4379) solr-core has a dependency to slf4j-jdk14 and is not binding agnostic

2013-06-26 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694276#comment-13694276
 ] 

Shikhar Bhushan edited comment on SOLR-4379 at 6/26/13 9:52 PM:


This is quite problematic: if you are transitively pulling in {{solr-core}}'s 
dependencies you'd get {{slf4j-jdk14}} on the classpath, and slf4j complains 
(not to mention the potential for it selecting the wrong binding):

{noformat}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:redacted/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:redacted/lib/slf4j-jdk14-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{noformat}

The fix should, I think, be as simple as adding {{<scope>provided</scope>}} for 
{{slf4j-jdk14}} in the solr-core POM.

  was (Author: shikhar):
This is quite problematic: if you are transitively pulling in 
{{solr-core}}'s dependencies you'd get {{slf4j-jdk14}} on the classpath, and 
slf4j complains (not to mention the potential for it selecting the wrong 
binding):

{noformat}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:redacted/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:redacted/lib/slf4j-jdk14-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{noformat}

The fix should, I think, be as simple as adding {{<scope>runtime</scope>}} for 
{{slf4j-jdk14}} in the solr-core POM.
  
 solr-core has a dependency to slf4j-jdk14 and is not binding agnostic
 -

 Key: SOLR-4379
 URL: https://issues.apache.org/jira/browse/SOLR-4379
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.1
Reporter: Nicolas Labrot
Priority: Minor

 solr-core can be used as a dependency in other projects which use other 
 bindings. In these cases slf4j-jdk14 must be excluded.
 In my opinion it may be better to move the slf4j-jdk14 dependency from 
 solr-core to the war project. 
 solr-core will then be binding agnostic.




[jira] [Comment Edited] (SOLR-4379) solr-core has a dependency to slf4j-jdk14 and is not binding agnostic

2013-06-26 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694276#comment-13694276
 ] 

Shikhar Bhushan edited comment on SOLR-4379 at 6/26/13 9:55 PM:


This is quite problematic: if you are transitively pulling in {{solr-core}}'s 
dependencies you'd get {{slf4j-jdk14}} on the classpath, and slf4j complains 
(not to mention the potential for it selecting the wrong binding):

{noformat}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:redacted/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:redacted/lib/slf4j-jdk14-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{noformat}

The fix should, I think, be as simple as adding {{<scope>runtime</scope>}} (or 
{{<scope>provided</scope>}}? -- not sure) for {{slf4j-jdk14}} in the solr-core 
POM.

  was (Author: shikhar):
This is quite problematic: if you are transitively pulling in 
{{solr-core}}'s dependencies you'd get {{slf4j-jdk14}} on the classpath, and 
slf4j complains (not to mention the potential for it selecting the wrong 
binding):

{noformat}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:redacted/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:redacted/lib/slf4j-jdk14-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{noformat}

The fix should, I think, be as simple as adding {{<scope>provided</scope>}} for 
{{slf4j-jdk14}} in the solr-core POM.
  
 solr-core has a dependency to slf4j-jdk14 and is not binding agnostic
 -

 Key: SOLR-4379
 URL: https://issues.apache.org/jira/browse/SOLR-4379
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.1
Reporter: Nicolas Labrot
Priority: Minor

 solr-core can be used as a dependency in other projects which use other 
 bindings. In these cases slf4j-jdk14 must be excluded.
 In my opinion it may be better to move the slf4j-jdk14 dependency from 
 solr-core to the war project. 
 solr-core will then be binding agnostic.




[jira] [Commented] (SOLR-4379) solr-core has a dependency to slf4j-jdk14 and is not binding agnostic

2013-06-26 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694285#comment-13694285
 ] 

Shikhar Bhushan commented on SOLR-4379:
---

Looks like this is fixed in 4.3.1 at least: 
http://repo1.maven.org/maven2/org/apache/solr/solr-core/4.3.1/solr-core-4.3.1.pom
 has no mention of {{slf4j-jdk14}}.

 solr-core has a dependency to slf4j-jdk14 and is not binding agnostic
 -

 Key: SOLR-4379
 URL: https://issues.apache.org/jira/browse/SOLR-4379
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.1
Reporter: Nicolas Labrot
Priority: Minor

 solr-core can be used as a dependency in other projects which use other 
 bindings. In these cases slf4j-jdk14 must be excluded.
 In my opinion it may be better to move the slf4j-jdk14 dependency from 
 solr-core to the war project. 
 solr-core will then be binding agnostic.




[jira] [Commented] (SOLR-4379) solr-core has a dependency to slf4j-jdk14 and is not binding agnostic

2013-06-26 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694290#comment-13694290
 ] 

Shikhar Bhushan commented on SOLR-4379:
---

Context: http://wiki.apache.org/solr/SolrLogging#Solr_4.3_and_above

 solr-core has a dependency to slf4j-jdk14 and is not binding agnostic
 -

 Key: SOLR-4379
 URL: https://issues.apache.org/jira/browse/SOLR-4379
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.1
Reporter: Nicolas Labrot
Priority: Minor

 solr-core can be used as a dependency in other projects which use other 
 bindings. In these cases slf4j-jdk14 must be excluded.
 In my opinion it may be better to move the slf4j-jdk14 dependency from 
 solr-core to the war project. 
 solr-core will then be binding agnostic.




[jira] [Created] (LUCENE-4998) be more precise about IOContext for reads

2013-05-13 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created LUCENE-4998:
---

 Summary: be more precise about IOContext for reads
 Key: LUCENE-4998
 URL: https://issues.apache.org/jira/browse/LUCENE-4998
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
Priority: Minor
 Fix For: 5.0, 4.4
 Attachments: LUCENE-4998.patch

Set the context as {{IOContext.READ}} / {{IOContext.READONCE}} where applicable



Motivation:

Custom {{PostingsFormat}} may want to check the context on {{SegmentReadState}} 
and branch differently, but for this to work properly the context has to be 
specified correctly up the stack.

For example, {{DirectPostingsFormat}} only loads postings into memory if the 
{{context != MERGE}}. However a better condition would be {{context == 
Context.READ && !context.readOnce}}.




[jira] [Updated] (LUCENE-4998) be more precise about IOContext for reads

2013-05-13 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated LUCENE-4998:


Attachment: LUCENE-4998.patch

 be more precise about IOContext for reads
 -

 Key: LUCENE-4998
 URL: https://issues.apache.org/jira/browse/LUCENE-4998
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shikhar Bhushan
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: LUCENE-4998.patch


 Set the context as {{IOContext.READ}} / {{IOContext.READONCE}} where 
 applicable
 
 Motivation:
 Custom {{PostingsFormat}} may want to check the context on 
 {{SegmentReadState}} and branch differently, but for this to work properly 
 the context has to be specified correctly up the stack.
 For example, {{DirectPostingsFormat}} only loads postings into memory if the 
 {{context != MERGE}}. However a better condition would be {{context == 
 Context.READ && !context.readOnce}}.




[jira] [Commented] (SOLR-4471) slave index version is different form master (full index copy needed)

2013-02-22 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584719#comment-13584719
 ] 

Shikhar Bhushan commented on SOLR-4471:
---

We ran into the same issue, with full copies happening on each incremental 
replication. The patch works! Thank you.

 slave index version is different form master (full index copy needed)
 -

 Key: SOLR-4471
 URL: https://issues.apache.org/jira/browse/SOLR-4471
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.1
Reporter: Andre Charton
Assignee: Mark Miller
  Labels: master, replication, slave, version
 Fix For: 4.2, 5.0

 Attachments: SOLR-4471.patch, SOLR-4471.patch, SOLR-4471.patch, 
 SOLR-4471_TestRefactor.diff, SOLR-4471_Tests.patch


 Scenario: master/slave replication, master delta index runs every 10 minutes, 
 slave poll interval is 10 sec.
 There was an issue, SOLR-4413 - slave reads index from wrong directory - which 
 made the slave do a full copy of the index from the master every time; this is 
 fixed after applying the patch from 4413 (see script below).
 Now on replication the slave downloads only the updated files, but the slave 
 creates a new segment file and also a new index version (the generation is 
 identical with the master). On the next poll the slave downloads the full 
 index again, because the new version on the slave forces a full copy.
 The problem is the new index version on the slave after the first replication.
 {noformat:apply patch SOLR-4413 script, please copy patch into patches 
 directory before usage.}
 mkdir work
 cd work
 svn co http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_1/
 cd lucene_solr_4_1
 patch -p0 < ../../patches/SOLR-4413.patch
 cd solr
 ant dist
 {noformat}
