[jira] [Commented] (LUCENE-6482) Class loading deadlock relating to Codec initialization, default codec and SPI discovery
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578288#comment-14578288 ]

Shikhar Bhushan commented on LUCENE-6482:
-----------------------------------------

Thanks for fixing this [~thetaphi]! Great digging on what was going on. The fix and the test look good to me.

> Class loading deadlock relating to Codec initialization, default codec and SPI discovery
> ----------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6482
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6482
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/codecs
>    Affects Versions: 4.9.1
>            Reporter: Shikhar Bhushan
>            Assignee: Uwe Schindler
>            Priority: Critical
>             Fix For: Trunk, 5.3, 5.2.1
>
>         Attachments: CodecLoadingDeadlockTest.java, LUCENE-6482-failingtest.patch, LUCENE-6482-failingtest.patch, LUCENE-6482.patch, LUCENE-6482.patch, LUCENE-6482.patch, LUCENE-6482.patch, LUCENE-6482.patch
>
>
> This issue came up for us several times with Elasticsearch 1.3.4 (Lucene 4.9.1), with many threads seeming deadlocked but RUNNABLE:
> {noformat}
> "elasticsearch[search77-es2][generic][T#43]" #160 daemon prio=5 os_prio=0 tid=0x7f79180c5800 nid=0x3d1f in Object.wait() [0x7f79d9289000]
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359)
> 	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:457)
> 	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:912)
> 	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:758)
> 	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:453)
> 	at org.elasticsearch.common.lucene.Lucene.readSegmentInfos(Lucene.java:98)
> 	at org.elasticsearch.index.store.Store.readSegmentsInfo(Store.java:126)
> 	at org.elasticsearch.index.store.Store.access$300(Store.java:76)
> 	at org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:465)
> 	at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:456)
> 	at org.elasticsearch.index.store.Store.readMetadataSnapshot(Store.java:281)
> 	at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.listStoreMetaData(TransportNodesListShardStoreMetaData.java:186)
> 	at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:140)
> 	at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:61)
> 	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:277)
> 	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:268)
> 	at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> It didn't really make sense to see RUNNABLE threads in Object.wait(), but this seems to be symptomatic of deadlocks in static initialization (http://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html). I found LUCENE-5573 as an instance of this having come up with Lucene code before.
> I'm not sure what exactly is going on, but the deadlock in this case seems to involve these threads:
> {noformat}
> "elasticsearch[search77-es2][clusterService#updateTask][T#1]" #79 daemon prio=5 os_prio=0 tid=0x7f7b155ff800 nid=0xd49 in Object.wait() [0x7f79daed8000]
>    java.lang.Thread.State: RUNNABLE
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> 	at java.lang.Class.newInstance(Class.java:433)
> 	at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:67)
> 	- locked <0x00061fef4968> (a org.apache.lucene.util.NamedSPILoader)
> 	at org.apache.lucene.util.NamedSPILoader.<init>(NamedSPILoader.java:47)
> 	at org.apache.lucene.util.NamedSPILoader.<init>(NamedSPILoader.java:37)
> 	at org.apache.lucene.codecs.PostingsFormat.<clinit>(PostingsFormat.java:44)
> 	at org.elasticsearch.index.codec.postingsformat.PostingFormats.<clinit>(PostingFormats.java:67)
> 	at org.elasticsearch.index.cod
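The "RUNNABLE but in Object.wait()" symptom described above can be observed without any Lucene code: a thread blocked on another thread's class initialization is parked on the JVM-internal init lock. The following is a minimal sketch with made-up class names (not from Lucene); on the HotSpot builds where this issue was reported such a thread shows up as RUNNABLE in stack dumps, though other JVM versions may report a waiting state instead.

```java
// Demo: t2 blocks while t1 runs a slow static initializer.
// All names here are illustrative, not taken from Lucene.
public class InitBlockDemo {
    static class Slow {
        static final long READY_AT;  // assigned in <clinit>, so not a constant
        static {
            try { Thread.sleep(500); } catch (InterruptedException e) { }
            READY_AT = System.nanoTime();
        }
    }

    public static void main(String[] args) throws Exception {
        Thread t1 = new Thread(() -> { long x = Slow.READY_AT; });
        t1.start();
        Thread.sleep(100);   // let t1 enter Slow.<clinit>
        Thread t2 = new Thread(() -> { long x = Slow.READY_AT; });
        t2.start();
        Thread.sleep(100);   // t2 is now blocked on Slow's init lock
        System.out.println("t2 while blocked on class init: " + t2.getState());
        t1.join();
        t2.join();
        System.out.println("done");
    }
}
```

Both threads complete once the initializer finishes; the interesting part is the state reported while t2 is blocked.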
[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shikhar Bhushan updated LUCENE-6482:
------------------------------------
    Attachment: CodecLoadingDeadlockTest.java
[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shikhar Bhushan updated LUCENE-6482:
------------------------------------
    Attachment: (was: CodecLoadingDeadlockTest.java)
[jira] [Commented] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575639#comment-14575639 ]

Shikhar Bhushan commented on LUCENE-6482:
-----------------------------------------

Thanks Uwe. In fact I have not had a single run that did not hit the deadlock; these lines do the trick every time:

{noformat}
import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.simpletext.SimpleTextCodec;

public class CodecLoadingDeadlockTest {
  public static void main(String... args) {
    // One thread triggers Codec's static initialization (SPI discovery),
    // the other triggers SimpleTextCodec's, which in turn needs its
    // superclass Codec initialized.
    final Thread t1 = new Thread(() -> Codec.getDefault());
    final Thread t2 = new Thread(() -> new SimpleTextCodec());
    t1.start();
    t2.start();
  }
}
{noformat}

I am using JDK 8u25.
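The cycle behind that two-line reproduction can be mimicked with stand-in classes, no Lucene required. In this sketch the names are invented: Base plays the role of Codec, whose static initializer reflectively instantiates implementations (as NamedSPILoader does), and Impl plays SimpleTextCodec. The threads are daemons and main uses timed joins, so the demo terminates even when the init deadlock occurs.

```java
// Stand-in for the Codec/SimpleTextCodec class-initialization cycle;
// all class names here are made up for illustration.
public class InitCycleDemo {
    public static void main(String[] args) throws Exception {
        Thread t1 = new Thread(() -> { Object o = Base.DEFAULT; }); // starts Base.<clinit>
        Thread t2 = new Thread(() -> { Object o = new Impl(); });   // starts Impl.<clinit>
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        Thread.sleep(50);   // let t1 grab Base's init lock first
        t2.start();
        t1.join(1000);
        t2.join(1000);
        // Typically false: t1 holds Base's init lock and wants Impl's,
        // while t2 holds Impl's init lock and wants Base's.
        System.out.println("both finished: " + (!t1.isAlive() && !t2.isAlive()));
    }
}

class Base {
    static final Base DEFAULT;
    static {
        try { Thread.sleep(200); } catch (InterruptedException e) { }
        DEFAULT = load();   // needs Impl initialized, like SPI discovery
    }
    static Base load() {
        try {
            return (Base) Class.forName("Impl").getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}

class Impl extends Base {
    // initializing Impl first requires initializing its superclass Base
}
```

The outcome depends on thread scheduling, which is why the attached test also cannot be a guaranteed reproduction on every machine.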
[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shikhar Bhushan updated LUCENE-6482:
------------------------------------
    Attachment: CodecLoadingDeadlockTest.java
[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shikhar Bhushan updated LUCENE-6482:
------------------------------------
    Attachment: (was: CodecLoadingDeadlockTest.java)
[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shikhar Bhushan updated LUCENE-6482:
------------------------------------
    Attachment: CodecLoadingDeadlockTest.java

[~thetaphi] I have had some luck reproducing the problem quite consistently with the attached test. If you uncomment the first line in main(), so that Codec is initialized before the threads start, the deadlock doesn't happen.
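The workaround noted above, forcing class initialization on a single thread before any concurrent use, can be sketched as follows. Registry is a hypothetical stand-in for a class with an expensive SPI-style static initializer; with real Lucene on the classpath the same effect comes from touching Codec.getDefault() once at startup.

```java
// Sketch of the eager-initialization workaround; Registry is a
// hypothetical stand-in, not a Lucene class.
public class EagerInitDemo {
    static class Registry {
        // not a compile-time constant, so reading it triggers <clinit>
        static final String DEFAULT = "default-codec".toUpperCase();
    }

    public static void main(String[] args) throws Exception {
        // Run Registry.<clinit> here, on the main thread, before any
        // other thread can race on the class-init lock.
        Class.forName(Registry.class.getName(), true,
                      Registry.class.getClassLoader());

        // By the time these threads read the field, initialization is done.
        Thread t1 = new Thread(() -> System.out.println(Registry.DEFAULT));
        Thread t2 = new Thread(() -> System.out.println(Registry.DEFAULT));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

This only sidesteps the race for a known class; it does not remove the underlying lock-ordering cycle, which is why the issue still needed a fix in Lucene itself.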
[jira] [Reopened] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan reopened LUCENE-6482: - Reopening as per discussion in https://github.com/elastic/elasticsearch/issues/11170 > Class loading deadlock relating to NamedSPILoader > - > > Key: LUCENE-6482 > URL: https://issues.apache.org/jira/browse/LUCENE-6482 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 4.9.1 > Reporter: Shikhar Bhushan >Assignee: Uwe Schindler > > This issue came up for us several times with Elasticsearch 1.3.4 (Lucene > 4.9.1), with many threads seeming deadlocked but RUNNABLE: > {noformat} > "elasticsearch[search77-es2][generic][T#43]" #160 daemon prio=5 os_prio=0 > tid=0x7f79180c5800 nid=0x3d1f in Object.wait() [0x7f79d9289000] >java.lang.Thread.State: RUNNABLE > at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359) > at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:457) > at > org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:912) > at > org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:758) > at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:453) > at > org.elasticsearch.common.lucene.Lucene.readSegmentInfos(Lucene.java:98) > at org.elasticsearch.index.store.Store.readSegmentsInfo(Store.java:126) > at org.elasticsearch.index.store.Store.access$300(Store.java:76) > at > org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:465) > at > org.elasticsearch.index.store.Store$MetadataSnapshot.(Store.java:456) > at > org.elasticsearch.index.store.Store.readMetadataSnapshot(Store.java:281) > at > org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.listStoreMetaData(TransportNodesListShardStoreMetaData.java:186) > at > org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:140) > at > 
org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:61) > at > org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:277) > at > org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:268) > at > org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > It didn't really make sense to see RUNNABLE threads in Object.wait(), but > this seems to be symptomatic of deadlocks in static initialization > (http://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html). > I found LUCENE-5573 as an instance of this having come up with Lucene code > before. 
> I'm not sure what exactly is going on, but the deadlock in this case seems to > involve these threads: > {noformat} > "elasticsearch[search77-es2][clusterService#updateTask][T#1]" #79 daemon > prio=5 os_prio=0 tid=0x7f7b155ff800 nid=0xd49 in Object.wait() > [0x7f79daed8000] >java.lang.Thread.State: RUNNABLE > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:408) > at java.lang.Class.newInstance(Class.java:433) > at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:67) > - locked <0x00061fef4968> (a org.apache.lucene.util.NamedSPILoader) > at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:47) > at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:37) > at > org.apache.lucene.codecs.PostingsFormat.(PostingsFormat.java:44) > at > org.elasticsearch.index.codec.postingsformat.PostingFormats.(PostingFormats.java:67) > at > org.elasticsearch.index.codec.Cod
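The pattern described above, static-initialization deadlock where stuck threads still show as RUNNABLE, can be reproduced with a minimal, hypothetical sketch (this is not Lucene's actual CodecLoadingDeadlockTest; class names and timings are illustrative): two classes whose static initializers each force initialization of the other, triggered from two threads at once.

```java
// Two classes whose static initializers each force initialization of the
// other. When two threads trigger the initializations concurrently, each
// thread holds its own class-init lock and blocks forever waiting for the
// other's, yet the JVM typically still reports the stuck threads as RUNNABLE.
public class StaticInitDeadlockDemo {

    static class A {
        static int VALUE;
        static {
            pause(500);            // widen the race window
            VALUE = B.VALUE + 1;   // forces B's <clinit> while holding A's init lock
        }
    }

    static class B {
        static int VALUE;
        static {
            pause(500);
            VALUE = A.VALUE + 1;   // forces A's <clinit> while holding B's init lock
        }
    }

    static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) { }
    }

    /** Returns true if both initializer threads are still stuck after a timeout. */
    public static boolean deadlocks() {
        Thread t1 = new Thread(() -> { int unused = A.VALUE; });
        Thread t2 = new Thread(() -> { int unused = B.VALUE; });
        t1.setDaemon(true);   // let the JVM exit despite the hung threads
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(2000);
            t2.join(200);
        } catch (InterruptedException e) {
            return false;
        }
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(deadlocks() ? "deadlocked" : "completed");
    }
}
```

Run as written, this is expected to print "deadlocked": both threads enter their class's `<clinit>`, sleep past each other, and then block on the opposing initialization lock.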
[jira] [Closed] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan closed LUCENE-6482. --- Resolution: Not A Problem Thanks [~thetaphi], makes sense and it does not seem like a Lucene issue, so I'll close this. It might have been because we use a custom Elasticsearch discovery plugin that is purely asynchronous, so those two bits of initialization ended up happening in parallel and caused the deadlock.
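Given the conclusion above (two initialization paths racing in parallel), a common mitigation for this class of deadlock is to force the affected classes to initialize eagerly, from a single thread, during startup. The following is a hedged sketch; the `warmUp` helper and the class names in the usage note are illustrative, not an existing Lucene or Elasticsearch API.

```java
// Forces the static initializers of the named classes to run on the calling
// thread, so that later concurrent code paths find them already initialized.
public final class EagerInit {

    private EagerInit() { }

    /** Runs the static initializers of the named classes on the calling thread. */
    public static void warmUp(String... classNames) {
        for (String name : classNames) {
            try {
                // initialize=true makes Class.forName run <clinit> immediately
                Class.forName(name, true, EagerInit.class.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("warm-up failed for " + name, e);
            }
        }
    }
}
```

For example, calling `EagerInit.warmUp("org.apache.lucene.codecs.Codec", "org.apache.lucene.codecs.PostingsFormat")` once during single-threaded bootstrap would pin their initialization to one thread; whether this is appropriate depends on the deployment.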
[jira] [Comment Edited] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544349#comment-14544349 ] Shikhar Bhushan edited comment on LUCENE-6482 at 5/14/15 8:57 PM: -- [~thetaphi] This was seen on JDK8u5, but I think this has also happened on JDK8u25 (not certain though...). The issue is not deterministic and comes up during cluster bounces sometimes, so it's hard to say whether an ES upgrade fixes it. You're probably right that this has nothing to do with NamedSPILoader but the classes being loaded. Is it possible to conclude from the thread dump whether an ES or Lucene Codec/PostingsFormat/etc is involved? was (Author: shikhar): [~thetaphi] This was seen on JDK8u5, but I think this has also happened on JDK8u25 (not certain though...). The issue is not deterministic and comes up during cluster bounces sometimes, so it's hard to say whether an upgrade fixes it. You're probably right that this has nothing to do with NamedSPILoader but the classes being loaded. Is that possible to conclude from the thread dump whether it is an ES or Lucene Codec / PostingsFormat is involved? 
[jira] [Commented] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544349#comment-14544349 ] Shikhar Bhushan commented on LUCENE-6482: - [~thetaphi] This was seen on JDK8u5, but I think this has also happened on JDK8u25 (not certain though...). The issue is not deterministic and comes up during cluster bounces sometimes, so it's hard to say whether an upgrade fixes it. You're probably right that this has nothing to do with NamedSPILoader but with the classes being loaded. Is it possible to conclude from the thread dump whether an ES or Lucene Codec / PostingsFormat is involved?
[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-6482: Description: This issue came up for us several times with Elasticsearch 1.3.4 (Lucene 4.9.1), with many threads seeming deadlocked but RUNNABLE: {noformat} "elasticsearch[search77-es2][generic][T#43]" #160 daemon prio=5 os_prio=0 tid=0x7f79180c5800 nid=0x3d1f in Object.wait() [0x7f79d9289000] java.lang.Thread.State: RUNNABLE at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359) at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:457) at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:912) at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:758) at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:453) at org.elasticsearch.common.lucene.Lucene.readSegmentInfos(Lucene.java:98) at org.elasticsearch.index.store.Store.readSegmentsInfo(Store.java:126) at org.elasticsearch.index.store.Store.access$300(Store.java:76) at org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:465) at org.elasticsearch.index.store.Store$MetadataSnapshot.(Store.java:456) at org.elasticsearch.index.store.Store.readMetadataSnapshot(Store.java:281) at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.listStoreMetaData(TransportNodesListShardStoreMetaData.java:186) at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:140) at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:61) at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:277) at 
org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:268) at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} It didn't really make sense to see RUNNABLE threads in Object.wait(), but this seems to be symptomatic of deadlocks in static initialization (http://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html). I found LUCENE-5573 as an instance of this having come up with Lucene code before. I'm not sure what exactly is going on, but the deadlock in this case seems to involve these threads: {noformat} "elasticsearch[search77-es2][clusterService#updateTask][T#1]" #79 daemon prio=5 os_prio=0 tid=0x7f7b155ff800 nid=0xd49 in Object.wait() [0x7f79daed8000] java.lang.Thread.State: RUNNABLE at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:408) at java.lang.Class.newInstance(Class.java:433) at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:67) - locked <0x00061fef4968> (a org.apache.lucene.util.NamedSPILoader) at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:47) at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:37) at org.apache.lucene.codecs.PostingsFormat.(PostingsFormat.java:44) at org.elasticsearch.index.codec.postingsformat.PostingFormats.(PostingFormats.java:67) at org.elasticsearch.index.codec.CodecModule.configurePostingsFormats(CodecModule.java:126) at 
org.elasticsearch.index.codec.CodecModule.configure(CodecModule.java:178) at org.elasticsearch.common.inject.AbstractModule.configure(AbstractModule.java:60) - locked <0x00061fef49e8> (a org.elasticsearch.index.codec.CodecModule) at org.elasticsearch.common.inject.spi.Elements$RecordingBinder.install(Elements.java:204) at org.elasticsearch.common.inject.spi.Elements.getElements(Elements.java:85) at org.elasticsearch.common.inject.InjectorShell$Builder.build(InjectorShell.java:130) at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:99) - locked <0x00061fef4c10> (a org.elastic
[jira] [Updated] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
[ https://issues.apache.org/jira/browse/LUCENE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-6482: Description: This issue came up for us several times with Elasticsearch 1.3.4 (Lucene 4.9.1), with many threads seeming deadlocked but RUNNABLE: {noformat} "elasticsearch[blabla-es0][clusterService#updateTask][T#1]" #79 daemon prio=5 os_prio=0 tid=0x7fd16988d000 nid=0x6e01 waiting on condition [0x7fd0bc279000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000614a22508> (a org.elasticsearch.common.util.concurrent.BaseFuture$Sync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:274) at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:113) at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:45) at org.elasticsearch.gateway.local.LocalGatewayAllocator.buildShardStores(LocalGatewayAllocator.java:443) at org.elasticsearch.gateway.local.LocalGatewayAllocator.allocateUnassigned(LocalGatewayAllocator.java:281) at org.elasticsearch.cluster.routing.allocation.allocator.ShardsAllocators.allocateUnassigned(ShardsAllocators.java:74) at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:217) at org.elasticsearch.cluster.routing.allocation.AllocationService.applyStartedShards(AllocationService.java:86) at org.elasticsearch.cluster.action.shard.ShardStateAction$4.execute(ShardStateAction.java:278) at 
org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:328) at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} It didn't really make sense to see RUNNABLE threads in Object.wait(), but this seems to be symptomatic of deadlocks in static initialization (http://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html). I found LUCENE-5573 as an instance of this having come up with Lucene code before. I'm not sure what exactly is going on, but the deadlock in this case seems to involve these threads: {noformat} "elasticsearch[search77-es2][clusterService#updateTask][T#1]" #79 daemon prio=5 os_prio=0 tid=0x7f7b155ff800 nid=0xd49 in Object.wait() [0x7f79daed8000] java.lang.Thread.State: RUNNABLE at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:408) at java.lang.Class.newInstance(Class.java:433) at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:67) - locked <0x00061fef4968> (a org.apache.lucene.util.NamedSPILoader) at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:47) at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:37) at org.apache.lucene.codecs.PostingsFormat.(PostingsFormat.java:44) at org.elasticsearch.index.codec.postingsformat.PostingFormats.(PostingFormats.java:67) at org.elasticsearch.index.codec.CodecModule.configurePostingsFormats(CodecModule.java:126) at 
org.elasticsearch.index.codec.CodecModule.configure(CodecModule.java:178) at org.elasticsearch.common.inject.AbstractModule.configure(AbstractModule.java:60) - locked <0x00061fef49e8> (a org.elasticsearch.index.codec.CodecModule) at org.elasticsearch.common.inject.spi.Elements$RecordingBinder.install(Elements.java:204) at org.elasticsearch.common.inject.spi.Elements.getElements(Elements.java:85) at org.elasticsearch.common.inject.InjectorShell$Builder.build(InjectorShell.java:130) at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:99) - locked <0x00061fef4c10&g
[jira] [Created] (LUCENE-6482) Class loading deadlock relating to NamedSPILoader
Shikhar Bhushan created LUCENE-6482: --- Summary: Class loading deadlock relating to NamedSPILoader Key: LUCENE-6482 URL: https://issues.apache.org/jira/browse/LUCENE-6482 Project: Lucene - Core Issue Type: Bug Reporter: Shikhar Bhushan This issue came up for us several times with Elasticsearch 1.3.4 (Lucene 4.9.1), seeing many threads seeming deadlocked but RUNNABLE: {noformat} "elasticsearch[blabla-es0][clusterService#updateTask][T#1]" #79 daemon prio=5 os_prio=0 tid=0x7fd16988d000 nid=0x6e01 waiting on condition [0x7fd0bc279000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000614a22508> (a org.elasticsearch.common.util.concurrent.BaseFuture$Sync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:274) at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:113) at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:45) at org.elasticsearch.gateway.local.LocalGatewayAllocator.buildShardStores(LocalGatewayAllocator.java:443) at org.elasticsearch.gateway.local.LocalGatewayAllocator.allocateUnassigned(LocalGatewayAllocator.java:281) at org.elasticsearch.cluster.routing.allocation.allocator.ShardsAllocators.allocateUnassigned(ShardsAllocators.java:74) at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:217) at org.elasticsearch.cluster.routing.allocation.AllocationService.applyStartedShards(AllocationService.java:86) at 
org.elasticsearch.cluster.action.shard.ShardStateAction$4.execute(ShardStateAction.java:278) at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:328) at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} It didn't really make sense to see RUNNABLE threads in Object.wait(), but this seems to be symptomatic of deadlocks in static initialization (http://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html). I found LUCENE-5573 as an instance of this having come up with Lucene code before. I'm not sure what exactly is going on, but the deadlock in this case seems to involve these threads: {noformat} "elasticsearch[search77-es2][clusterService#updateTask][T#1]" #79 daemon prio=5 os_prio=0 tid=0x7f7b155ff800 nid=0xd49 in Object.wait() [0x7f79daed8000] java.lang.Thread.State: RUNNABLE at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:408) at java.lang.Class.newInstance(Class.java:433) at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:67) - locked <0x00061fef4968> (a org.apache.lucene.util.NamedSPILoader) at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:47) at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:37) at org.apache.lucene.codecs.PostingsFormat.(PostingsFormat.java:44) at org.elasticsearch.index.codec.postingsformat.PostingFormats.(PostingFormats.java:67) at 
org.elasticsearch.index.codec.CodecModule.configurePostingsFormats(CodecModule.java:126) at org.elasticsearch.index.codec.CodecModule.configure(CodecModule.java:178) at org.elasticsearch.common.inject.AbstractModule.configure(AbstractModule.java:60) - locked <0x00061fef49e8> (a org.elasticsearch.index.codec.CodecModule) at org.elasticsearch.common.inject.spi.Elements$RecordingBinder.install(Elements.java:204) at org.elasticsearch.common.inject.spi.Elements.getElements(Elements.java:85) at org.elasticsearch.common.inject.InjectorShell$
[jira] [Comment Edited] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340661#comment-14340661 ] Shikhar Bhushan edited comment on LUCENE-6294 at 2/27/15 7:25 PM: -- When slicing differently than segment-per-slice, it'd probably be desirable to distribute segments by size across the slices, rather than all large segments ending up in one slice to be searched sequentially. was (Author: shikhar): When slicing differnetly than segment-per-slice, it'd probably be desirable to distribute segments by size across the slices, rather than all large segments ending up in one slice to be searched sequentially. > Generalize how IndexSearcher parallelizes collection execution > -- > > Key: LUCENE-6294 > URL: https://issues.apache.org/jira/browse/LUCENE-6294 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Assignee: Adrien Grand >Priority: Trivial > Fix For: Trunk, 5.1 > > Attachments: LUCENE-6294.patch > > > IndexSearcher takes an ExecutorService that can be used to parallelize > collection execution. This is useful if you want to trade throughput for > latency. > However, this executor service will only be used if you search for top docs. > In that case, we will create one collector per slice and call TopDocs.merge > in the end. If you use search(Query, Collector), the executor service will > never be used. > But there are other collectors that could work the same way as top docs > collectors, eg. TotalHitCountCollector. And maybe also some of our users' > collectors. So maybe IndexSearcher could expose a generic way to take > advantage of the executor service? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340661#comment-14340661 ] Shikhar Bhushan commented on LUCENE-6294: - When slicing differently than segment-per-slice, it'd probably be desirable to distribute large segments by size across the slices, rather than all of them ending up in one slice to be searched sequentially.
[jira] [Comment Edited] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340661#comment-14340661 ] Shikhar Bhushan edited comment on LUCENE-6294 at 2/27/15 7:25 PM: -- When slicing differently than segment-per-slice, it'd probably be desirable to distribute segments by size across the slices, rather than all large segments ending up in one slice to be searched sequentially. was (Author: shikhar): When slicing differently than segment-per-slice, it'd probably be desirable to distribute large segments by size across the slices, rather than all of them ending up in one slice to be searched sequentially.
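The size-aware distribution suggested above is essentially greedy longest-processing-time scheduling: sort segments by descending size, then always hand the next segment to the slice with the smallest running total, so large segments never pile into the same slice. The following is a standalone sketch with hypothetical names (SliceBySize, sliceBySize), not Lucene's actual slicing code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class SliceBySize {

    /**
     * Distributes segment sizes across numSlices so the total size per slice
     * stays balanced: sort descending, then always append the next segment to
     * the slice with the smallest running total (greedy LPT scheduling).
     */
    public static List<List<Long>> sliceBySize(List<Long> segmentSizes, int numSlices) {
        List<Long> sorted = new ArrayList<>(segmentSizes);
        sorted.sort(Comparator.reverseOrder());

        // min-heap of {runningTotal, sliceIndex}, smallest total first
        PriorityQueue<long[]> heap =
            new PriorityQueue<>(Comparator.comparingLong((long[] a) -> a[0]));
        List<List<Long>> slices = new ArrayList<>();
        for (int i = 0; i < numSlices; i++) {
            slices.add(new ArrayList<>());
            heap.add(new long[] {0L, i});
        }
        for (long size : sorted) {
            long[] smallest = heap.poll();        // slice with least work so far
            slices.get((int) smallest[1]).add(size);
            smallest[0] += size;
            heap.add(smallest);
        }
        return slices;
    }
}
```

With segment-per-slice this balancing is moot, but for fewer slices than segments it keeps any one slice from becoming the sequential bottleneck.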
[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340640#comment-14340640 ] Shikhar Bhushan commented on LUCENE-6294: - Makes sense! Seems to be already customizable by overriding that method.
[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340544#comment-14340544 ] Shikhar Bhushan commented on LUCENE-6294: - This is great. I saw some improvements when testing LUCENE-5299 with the addition of a configurable parallelism throttle at the search request level using a semaphore; that might be useful to have here too, i.e. being able to cap how many segments are concurrently searched. That can help ensure resources for concurrent search requests, or reduce context switching if using an unbounded pool.
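The request-level throttle described above can be sketched with plain java.util.concurrent primitives: each per-segment task acquires a permit before searching and releases it afterwards, capping concurrency regardless of pool size. This is an illustrative standalone sketch under assumed names (ThrottledSearch, searchSegment), not the LUCENE-5299 patch; note that blocking inside pooled threads is itself a cost, which is why the LUCENE-5299 discussion favors non-blocking constructs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledSearch {
    static final AtomicInteger active = new AtomicInteger();
    static final AtomicInteger maxActive = new AtomicInteger();

    // Stand-in for searching one segment; records observed concurrency.
    static void searchSegment(int segment) throws InterruptedException {
        int now = active.incrementAndGet();
        maxActive.accumulateAndGet(now, Math::max);
        Thread.sleep(20); // simulate per-segment work
        active.decrementAndGet();
    }

    /** Searches all segments on the pool, at most maxParallel at a time. */
    public static void searchAll(ExecutorService pool, int segments, int maxParallel)
            throws Exception {
        Semaphore throttle = new Semaphore(maxParallel);
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < segments; i++) {
            final int seg = i;
            futures.add(pool.submit(() -> {
                throttle.acquire();          // blocks once the cap is reached
                try {
                    searchSegment(seg);
                } finally {
                    throttle.release();
                }
                return null;
            }));
        }
        for (Future<?> f : futures) f.get();  // wait and propagate failures
    }
}
```

A per-request Semaphore caps one search's fan-out while leaving pool capacity for concurrent requests; a shared Semaphore would instead cap total segment-search concurrency across requests.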
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340521#comment-14340521 ] Shikhar Bhushan commented on LUCENE-5299: - LUCENE-6294 is definitely a less intrusive approach. I think the tradeoff is that by moving the parallelization into the {{Collector}} API itself, we can make it composable and work for any arbitrary permutation of parallelizable collectors. > Refactor Collector API for parallelism > -- > > Key: LUCENE-5299 > URL: https://issues.apache.org/jira/browse/LUCENE-5299 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Shikhar Bhushan > Fix For: Trunk, 5.1 > > Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, > LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt > > > h2. Motivation > We should be able to scale-up better with Solr/Lucene by utilizing multiple > CPU cores, and not have to resort to scaling-out by sharding (with all the > associated distributed system pitfalls) when the index size does not warrant > it. > Presently, IndexSearcher has an optional constructor arg for an > ExecutorService, which gets used for searching in parallel for call paths > where one of the TopDocCollector's is created internally. The > per-atomic-reader search happens in parallel and then the > TopDocs/TopFieldDocs results are merged with locking around the merge bit. > However there are some problems with this approach: > * If arbitrary Collector args come into play, we can't parallelize. Note that > even if ultimately results are going to a TopDocCollector it may be wrapped > inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both. > * The special-casing with parallelism baked on top does not scale, there are > many Collector's that could potentially lend themselves to parallelism, and > special-casing means the parallelization has to be re-implemented if a > different permutation of collectors is to be used. > h2.
Proposal > A refactoring of collectors that allows for parallelization at the level of > the collection protocol. > Some requirements that should guide the implementation: > * easy migration path for collectors that need to remain serial > * the parallelization should be composable (when collectors wrap other > collectors) > * allow collectors to pick the optimal solution (e.g. there might be memory > tradeoffs to be made) by advising the collector about whether a search will > be parallelized, so that the serial use-case is not penalized. > * encourage use of non-blocking constructs and lock-free parallelism, > blocking is not advisable for the hot-spot of a search, besides wasting > pooled threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213132#comment-14213132 ] Shikhar Bhushan commented on LUCENE-5299: - Slides from my talk at Lucene/Solr Revolution 2014 about this stuff - https://www.dropbox.com/s/h2nqsml0beed0pm/Search-time%20Parallelism.pdf Some backstory about the recent revival of this issue. The presentation was going to be a failure story, since I had not seen good performance on our test cluster when I tried it out last year. However, after adding that request-level parallelism throttle and possibly eliminating some bugs while cherry-picking onto latest trunk, I've seen consistently good results. You can see from the replay graphs towards the end that p99 dropped by half, p95 improved by a few hundred ms, and the median looks much improved too. CPU usage was higher, as expected, but about similar to (I think less than, though I don't have numbers) the overhead we saw by sharding and running all the shards on localhost. We are still sharded in this manner, so as you can see we considered the latency win to be worth it!
[jira] [Comment Edited] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185751#comment-14185751 ] Shikhar Bhushan edited comment on LUCENE-5299 at 10/27/14 9:01 PM: --- Just an update that the code rebased against recent trunk lives at https://github.com/shikhar/lucene-solr/tree/LUCENE-5299. I've made various tweaks, like being able to throttle per-request parallelism in {{ParallelSearchStrategy}}. luceneutil bench numbers when running with ^ & hacked IndexSearcher constructor that uses {{ParallelSearchStrategy(new ForkJoinPool(128), 8)}}, against trunk, on a 32 core (with HT) Sandy Bridge server, with source {{wikimedium500k}} SEARCH_NUM_THREADS = 16 {noformat} Report after iter 19: TaskQPS baseline StdDev QPS parcol StdDev Pct diff Fuzzy1 81.91 (43.2%) 52.96 (39.7%) -35.3% ( -82% - 83%) LowTerm 2550.11 (11.9%) 1927.28 (5.6%) -24.4% ( -37% - -7%) Respell 43.02 (39.4%) 35.23 (31.5%) -18.1% ( -63% - 87%) Fuzzy2 19.32 (25.1%) 16.40 (34.8%) -15.1% ( -59% - 59%) MedTerm 1679.37 (12.2%) 1743.27 (8.6%) 3.8% ( -15% - 28%) PKLookup 221.58 (8.3%) 257.36 (13.2%) 16.1% ( -4% - 41%) AndHighLow 1027.99 (11.6%) 1278.39 (15.9%) 24.4% ( -2% - 58%) AndHighMed 741.50 (10.0%) 1198.04 (27.5%) 61.6% ( 21% - 110%) MedPhrase 709.04 (11.6%) 1203.02 (24.3%) 69.7% ( 30% - 119%) LowSpanNear 601.13 (16.9%) 1127.30 (16.7%) 87.5% ( 46% - 145%) LowSloppyPhrase 554.87 (10.8%) 1130.25 (30.5%) 103.7% ( 56% - 162%) OrHighMed 408.55 (10.4%) 977.56 (20.1%) 139.3% ( 98% - 189%) LowPhrase 364.36 (10.8%) 893.27 (41.0%) 145.2% ( 84% - 220%) OrHighLow 355.78 (12.7%) 893.63 (19.6%) 151.2% ( 105% - 210%) AndHighHigh 390.73 (10.3%) 1004.70 (24.3%) 157.1% ( 111% - 213%) HighTerm 399.01 (11.8%) 1067.67 (12.1%) 167.6% ( 128% - 217%) Wildcard 754.76 (11.6%) 2067.96 (28.0%) 174.0% ( 120% - 241%) HighSpanNear 153.57 (14.8%) 463.54 (24.3%) 201.8% ( 141% - 282%) OrHighHigh 212.16 (12.4%) 665.56 (28.2%) 213.7% ( 154% - 290%) HighPhrase 170.49 (13.1%) 
547.72 (17.3%) 221.3% ( 168% - 289%) HighSloppyPhrase 66.91 (10.1%) 219.59 (12.0%) 228.2% ( 187% - 278%) MedSloppyPhrase 128.73 (12.5%) 425.67 (20.3%) 230.7% ( 175% - 300%) MedSpanNear 130.31 (10.7%) 436.12 (18.2%) 234.7% ( 185% - 295%) Prefix3 166.91 (14.9%) 652.64 (26.7%) 291.0% ( 217% - 390%) IntNRQ 110.73 (15.0%) 467.72 (33.6%) 322.4% ( 238% - 436%) {noformat} SEARCH_NUM_THREADS=32 {noformat} Task QPS baseline StdDev QPS parcol StdDev Pct diff LowTerm 2401.88 (12.7%) 1799.27 (6.3%) -25.1% ( -39% - -6%) Fuzzy2 6.52 (14.4%) 5.74 (24.0%) -11.9% ( -43% - 30%) Respell 45.13 (90.2%) 40.94 (83.5%) -9.3% ( -96% - 1679%) PKLookup 232.02 (12.9%) 228.35 (12.4%) -1.6% ( -23% - 27%) MedTerm 1612.01 (14.0%) 1601.71 (10.9%) -0.6% ( -22% - 28%) Fuzzy1 14.19 (79.3%) 14.71 (177.6%) 3.7% (-141% - 1258%) AndHighLow 1205.65 (17.5%) 1254.76 (15.9%) 4.1% ( -24% - 45%) MedSpanNear 478.11 (25.4%) 946.72 (34.5%) 98.0% ( 30% - 211%) OrHighLow 424.71 (14.5%) 941.39 (31.4%) 121.7% ( 66% - 195%) AndHighHigh 377.82 (13.3%) 910.77 (32.2%) 141.1% ( 84% - 215%) HighTerm 325.35 (11.3%) 855.63 (8.9%) 163.0% ( 128% - 206%) AndHighMed 346.57 (11.7%) 914.59 (26.4%) 163.9% ( 112% - 228%) MedPhrase 227.47 (13.1%) 621.50 (22.9%) 173.2% ( 121% - 240%) LowSloppyPhrase 265.21 (10.4%) 748.30 (49.2%) 182.2% ( 110% - 269%) OrHighMed 221.49 (12.2%) 632.55 (23.9%)
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185751#comment-14185751 ] Shikhar Bhushan commented on LUCENE-5299: - Just an update that the code rebased against recent trunk lives at https://github.com/shikhar/lucene-solr/tree/LUCENE-5299. I've made various tweaks, like being able to throttle per-request parallelism in {{ParallelSearchStrategy}}. luceneutil bench numbers when running with ^ + hacked IndexSearcher constructor that uses {{ParallelSearchStrategy(new ForkJoinPool(128), 8)}} + luceneutil constants.py SEARCH_NUM_THREADS = 16 Against trunk, on a 32 core (with HT) Sandy Bridge server, with source {{wikimedium500k}} {noformat} Report after iter 19: TaskQPS baseline StdDev QPS parcol StdDev Pct diff Fuzzy1 81.91 (43.2%) 52.96 (39.7%) -35.3% ( -82% - 83%) LowTerm 2550.11 (11.9%) 1927.28 (5.6%) -24.4% ( -37% - -7%) Respell 43.02 (39.4%) 35.23 (31.5%) -18.1% ( -63% - 87%) Fuzzy2 19.32 (25.1%) 16.40 (34.8%) -15.1% ( -59% - 59%) MedTerm 1679.37 (12.2%) 1743.27 (8.6%) 3.8% ( -15% - 28%) PKLookup 221.58 (8.3%) 257.36 (13.2%) 16.1% ( -4% - 41%) AndHighLow 1027.99 (11.6%) 1278.39 (15.9%) 24.4% ( -2% - 58%) AndHighMed 741.50 (10.0%) 1198.04 (27.5%) 61.6% ( 21% - 110%) MedPhrase 709.04 (11.6%) 1203.02 (24.3%) 69.7% ( 30% - 119%) LowSpanNear 601.13 (16.9%) 1127.30 (16.7%) 87.5% ( 46% - 145%) LowSloppyPhrase 554.87 (10.8%) 1130.25 (30.5%) 103.7% ( 56% - 162%) OrHighMed 408.55 (10.4%) 977.56 (20.1%) 139.3% ( 98% - 189%) LowPhrase 364.36 (10.8%) 893.27 (41.0%) 145.2% ( 84% - 220%) OrHighLow 355.78 (12.7%) 893.63 (19.6%) 151.2% ( 105% - 210%) AndHighHigh 390.73 (10.3%) 1004.70 (24.3%) 157.1% ( 111% - 213%) HighTerm 399.01 (11.8%) 1067.67 (12.1%) 167.6% ( 128% - 217%) Wildcard 754.76 (11.6%) 2067.96 (28.0%) 174.0% ( 120% - 241%) HighSpanNear 153.57 (14.8%) 463.54 (24.3%) 201.8% ( 141% - 282%) OrHighHigh 212.16 (12.4%) 665.56 (28.2%) 213.7% ( 154% - 290%) HighPhrase 170.49 (13.1%) 
547.72 (17.3%) 221.3% ( 168% - 289%) HighSloppyPhrase 66.91 (10.1%) 219.59 (12.0%) 228.2% ( 187% - 278%) MedSloppyPhrase 128.73 (12.5%) 425.67 (20.3%) 230.7% ( 175% - 300%) MedSpanNear 130.31 (10.7%) 436.12 (18.2%) 234.7% ( 185% - 295%) Prefix3 166.91 (14.9%) 652.64 (26.7%) 291.0% ( 217% - 390%) IntNRQ 110.73 (15.0%) 467.72 (33.6%) 322.4% ( 238% - 436%) {noformat}
[jira] [Commented] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used
[ https://issues.apache.org/jira/browse/SOLR-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004799#comment-14004799 ] Shikhar Bhushan commented on SOLR-6105: --- also paging [~vzhovtiuk] - presumably you're using this feature in your app. does debugQuery=true work ok for you? > DebugComponent NPE when single-pass distributed search is used > -- > > Key: SOLR-6105 > URL: https://issues.apache.org/jira/browse/SOLR-6105 > Project: Solr > Issue Type: Bug >Reporter: Shikhar Bhushan >Priority: Minor > > I'm seeing NPE's in {{DebugComponent}} with debugQuery=true when just ID & > score are requested, which enables the single-pass distributed search > optimization from SOLR-1880. > The NPE originates on this line in DebugComponent.finishStage(): > {noformat} > int idx = sdoc.positionInResponse; > {noformat} > indicating an ID that is in the explain but missing in the resultIds. > I'm afraid I haven't been able to reproduce this in > {{DistributedQueryComponentOptimizationTest}}, but wanted to open this ticket > in any case. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used
Shikhar Bhushan created SOLR-6105: - Summary: DebugComponent NPE when single-pass distributed search is used Key: SOLR-6105 URL: https://issues.apache.org/jira/browse/SOLR-6105 Project: Solr Issue Type: Bug Reporter: Shikhar Bhushan Priority: Minor I'm seeing NPE's in {{DebugComponent}} with debugQuery=true when just ID & score are requested, which enables the single-pass distributed search optimization from SOLR-1880. The NPE originates on this line in DebugComponent.finishStage(): {noformat} int idx = sdoc.positionInResponse; {noformat} indicating an ID that is in the explain but missing in the resultIds. I'm afraid I haven't been able to reproduce this in {{DistributedQueryComponentOptimizationTest}}, but wanted to open this ticket in any case. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used
[ https://issues.apache.org/jira/browse/SOLR-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004796#comment-14004796 ] Shikhar Bhushan commented on SOLR-6105: --- paging [~shalinmangar] in case you have any idea what might be going on
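The failure mode reported here — an ID present in a shard's explain info but absent from the merged resultIds — suggests a defensive skip rather than an unconditional dereference. The sketch below models that idea with plain maps and hypothetical names (ExplainMerge, orderedExplains); it is not the actual DebugComponent code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExplainMerge {
    /**
     * Merges per-document explain entries, keeping only IDs that actually
     * made it into the merged result set. positionOf maps doc ID to its
     * position in the response; IDs missing from it are skipped rather than
     * triggering a NullPointerException on a null position lookup.
     */
    public static Map<String, String> orderedExplains(
            Map<String, String> explainById, Map<String, Integer> positionOf) {
        Map<String, String> out = new LinkedHashMap<>();
        explainById.forEach((id, explain) -> {
            Integer pos = positionOf.get(id);
            if (pos == null) {
                return; // explained on a shard but absent from merged results
            }
            out.put(id, explain);
        });
        return out;
    }
}
```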
[jira] [Closed] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats
[ https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan closed SOLR-5648. - Resolution: Invalid bq. 1) I'm not sure i really understand what this adds – isn't every registered searcher (which should include every open searcher if there are more then one) already listed in the infoRegistry (so it's stats are surfaced in /admin/mbeans and via JMX) ? you're right! that's much better. > SolrCore#getStatistics() should nest open searchers' stats > -- > > Key: SOLR-5648 > URL: https://issues.apache.org/jira/browse/SOLR-5648 > Project: Solr > Issue Type: Task >Reporter: Shikhar Bhushan >Priority: Minor > Fix For: 4.9, 5.0 > > Attachments: SOLR-5648.patch, oldestSearcherStaleness.gif, > openSearchers.gif > > > {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues > in codebases with custom components. > So it is useful to be able to access monitoring information about what > searchers are currently open, and in turn access their stats e.g. > {{openedAt}}. > This can be nested via {{SolrCore#getStatistics()}} which has a > {{_searchers}} collection of all open searchers. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999732#comment-13999732 ] Shikhar Bhushan commented on LUCENE-4370: - Been thinking about the semantics of these done callbacks not being invoked in case of exceptions, which was a concern raised by [~jpountz] in LUCENE-5527. This seems not very helpful when e.g. a TimeExceededException or EarlyTerminatingCollectorException is thrown and you need to merge some state into the parent collector in {{LeafCollector.leafDone()}}, or perhaps finalize results in {{Collector.done()}}. Maybe we need a special kind of exception, just like CollectionTerminatedException. The semantics for CollectionTerminatedException are currently that collection continues with the next leaf. So some new base-class for the "rethrow me but invoke done callbacks" case? In case of any other kind of exception, like IOException, I don't think we should be invoking done() callbacks, because the collector's results should not be expected to be usable. > Let Collector know when all docs have been collected > > > Key: LUCENE-4370 > URL: https://issues.apache.org/jira/browse/LUCENE-4370 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 4.0-BETA >Reporter: Tomás Fernández Löbbe >Priority: Minor > Attachments: LUCENE-4370.patch, LUCENE-4370.patch > > > Collectors are a good point for extension/customization of Lucene/Solr, > however sometimes it's necessary to know when the last document has been > collected (for example, for flushing cached data). > It would be nice to have a method that gets called after the last doc has > been collected. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
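The proposed contract — terminating exceptions after which done callbacks still run, versus hard failures like IOException after which they are skipped — could be driven from the collection loop roughly like this. All names here (CollectionFinishedEarlyException, LeafCollectorSketch, collectLeaf) are hypothetical illustrations, not actual Lucene classes:

```java
import java.io.IOException;

public class DoneCallbackSketch {

    /** Hypothetical base class: "rethrow me, but invoke done callbacks first". */
    static class CollectionFinishedEarlyException extends RuntimeException {}

    interface LeafCollectorSketch {
        void collect(int doc) throws IOException;
        void leafDone();
    }

    /**
     * Collects docs 0..maxDoc-1. On an early-termination signal the
     * leafDone() callback still runs before the signal is rethrown, so the
     * collector's partial results stay usable. On any other exception
     * (e.g. IOException) the callback is skipped: results are not expected
     * to be usable.
     */
    public static void collectLeaf(LeafCollectorSketch leaf, int maxDoc) throws IOException {
        try {
            for (int doc = 0; doc < maxDoc; doc++) {
                leaf.collect(doc);
            }
        } catch (CollectionFinishedEarlyException e) {
            leaf.leafDone();  // finalize partial state, then propagate
            throw e;
        }
        leaf.leafDone();      // normal completion
    }
}
```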
[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-4370: Attachment: (was: LUCENE-4370.patch)
[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997666#comment-13997666 ] Shikhar Bhushan commented on LUCENE-4370: - > On one hand I think a Collector.finish() would be nice, but the argument > could be made you could handle this yourself (it's done when > IndexSearcher.search returns). Such a technique does not compose easily, e.g. when you want to wrap collectors in other collectors, unless you customize each and every one in the chain.
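To make the composability argument concrete: if completion is part of the collector contract, a wrapper simply forwards the callback, and arbitrary chains work without customizing each layer. A minimal sketch with a hypothetical done()-bearing interface (not the actual patch API):

```java
public class ComposedDone {

    /** Hypothetical collector contract with a completion callback. */
    public interface CollectorSketch {
        void collect(int doc);
        void done();
    }

    /** Innermost collector: finalizes (here, just records) on done(). */
    public static class Leaf implements CollectorSketch {
        public int hits;
        public boolean flushed;
        public void collect(int doc) { hits++; }
        public void done() { flushed = true; }
    }

    /**
     * Any wrapper (think EarlyTerminating- or TimeLimitingCollector)
     * composes by forwarding done() to the collector it wraps, so the
     * callback reaches the bottom of an arbitrary chain without
     * special-casing each layer.
     */
    public static class Wrapper implements CollectorSketch {
        private final CollectorSketch in;
        public Wrapper(CollectorSketch in) { this.in = in; }
        public void collect(int doc) { in.collect(doc); }
        public void done() { in.done(); }
    }
}
```

Handling completion outside the API (after IndexSearcher.search returns) only notifies the outermost object, which is exactly the non-composability the comment points out.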
[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-4370: Attachment: LUCENE-4370.patch Attaching another version which adds a callback on both Collector ({{void done();}}) and LeafCollector ({{void leafDone();}}).
[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995070#comment-13995070 ] Shikhar Bhushan commented on LUCENE-4370: - Umm, I totally forgot about the callers. Updated patch coming.
[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-4370: Attachment: LUCENE-4370.patch Attaching patch. I updated callers based on auditing usages of {{Collector.getLeafCollector(..)}}
[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-4370: Attachment: (was: LUCENE-4370.patch) > Let Collector know when all docs have been collected > > > Key: LUCENE-4370 > URL: https://issues.apache.org/jira/browse/LUCENE-4370 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 4.0-BETA >Reporter: Tomás Fernández Löbbe >Priority: Minor > > Collectors are a good point for extension/customization of Lucene/Solr, > however sometimes it's necessary to know when the last document has been > collected (for example, for flushing cached data). > It would be nice to have a method that gets called after the last doc has > been collected. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-4370: Attachment: LUCENE-4370.patch > Let Collector know when all docs have been collected > > > Key: LUCENE-4370 > URL: https://issues.apache.org/jira/browse/LUCENE-4370 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 4.0-BETA >Reporter: Tomás Fernández Löbbe >Priority: Minor > Attachments: LUCENE-4370.patch, LUCENE-4370.patch > > > Collectors are a good point for extension/customization of Lucene/Solr, > however sometimes it's necessary to know when the last document has been > collected (for example, for flushing cached data). > It would be nice to have a method that gets called after the last doc has > been collected. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-4370: Attachment: LUCENE-4370.patch Attaching patch. I think there is a huge potential for cleanups if this goes in, I'm happy to work on some of that. > Let Collector know when all docs have been collected > > > Key: LUCENE-4370 > URL: https://issues.apache.org/jira/browse/LUCENE-4370 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 4.0-BETA >Reporter: Tomás Fernández Löbbe >Priority: Minor > Attachments: LUCENE-4370.patch, LUCENE-4370.patch > > > Collectors are a good point for extension/customization of Lucene/Solr, > however sometimes it's necessary to know when the last document has been > collected (for example, for flushing cached data). > It would be nice to have a method that gets called after the last doc has > been collected. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Issue with functions that require metadata, and LeafCollectors
On Fri, May 2, 2014 at 3:06 AM, Chris Russell <chris.russ...@careerbuilder.com> wrote: > I need the LeafCollectors in the first loop where I am making the scorers > because LeafCollector now has the acceptsDocsOutOfOrder method. > > I wonder if the answer here is that acceptsDocsOutOfOrder() should live on the Collector rather than the LeafCollector. Are there cases where that does not make sense?
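To make the distinction concrete, here is a minimal hypothetical sketch (simplified names and signatures, not the actual Lucene interfaces) of what hoisting acceptsDocsOutOfOrder() to the top-level Collector would look like: the caller can choose a scorer strategy up front, before asking for any per-leaf collector.

```java
// Hypothetical sketch: acceptsDocsOutOfOrder() on the top-level Collector
// rather than on the per-segment LeafCollector, so scorers can be chosen
// before any leaf collector exists. Names are illustrative only.
public class CollectorSketch {

  interface LeafCollector {
    void collect(int doc);
  }

  interface Collector {
    // Queried once, up front -- no LeafCollector needed yet.
    boolean acceptsDocsOutOfOrder();
    LeafCollector getLeafCollector(int leafOrd);
  }

  static class CountingCollector implements Collector {
    int count;
    public boolean acceptsDocsOutOfOrder() { return true; }
    public LeafCollector getLeafCollector(int leafOrd) {
      return doc -> count++;
    }
  }

  public static void main(String[] args) {
    CountingCollector c = new CountingCollector();
    // Scorer strategy can be decided before any leaf is visited.
    boolean outOfOrderOk = c.acceptsDocsOutOfOrder();
    LeafCollector leaf = c.getLeafCollector(0);
    leaf.collect(3);
    leaf.collect(1); // out of order is fine for this collector
    System.out.println(outOfOrderOk + " " + c.count); // true 2
  }
}
```

If the property really can vary per segment for some collector, this placement loses that flexibility, which is presumably the "cases where that does not make sense" question above.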
[jira] [Comment Edited] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961394#comment-13961394 ] Shikhar Bhushan edited comment on LUCENE-4370 at 4/6/14 12:45 PM: -- Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void done();}} Semantics: It is invoked when collection with that leaf has completed. It is not invoked if collection terminates due to an exception. I know this ticket was originally about having such a method on {{Collector}} and not at the segment-level collection, however I think all use cases can be cleanly modelled in this manner. As naming goes, I think {{done()}} or such is better than {{close()}}, which implies a try-finally'esque construct. Edit: changed my proposal from {{finish()}} to {{done()}} to avoid messing with existing uses e.g. {{DelegatingCollector}} which would currently extend {{SimpleCollector}} that implements both {{Collector}} and {{LeafCollector}}. /cc [~jpountz] [~rcmuir] [~hossman] was (Author: shikhar): Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void done();}} Semantics: It is invoked when collection with that leaf has completed. It is not invoked if collection terminates due to an exception. I know this ticket was originally about having such a method on {{Collector}} and not at the segment-level collection, however I think all use cases can be cleanly modelled in this manner. As naming goes, I think {{done()}} or such is better than {{close()}}, which implies a try-finally'esque construct. Edit: changed my proposal from {{finish()}} to {{done()}} to avoid messing with existing uses e.g. {{DelegatingCollector}} which would currently extend {{SimpleCollector}} that implements both {{Collector}} and {{LeafCollector}}.
/cc [~jpountz] [~rcmuir] [~hossman] > Let Collector know when all docs have been collected > > > Key: LUCENE-4370 > URL: https://issues.apache.org/jira/browse/LUCENE-4370 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 4.0-BETA >Reporter: Tomás Fernández Löbbe >Priority: Minor > > Collectors are a good point for extension/customization of Lucene/Solr, > however sometimes it's necessary to know when the last document has been > collected (for example, for flushing cached data). > It would be nice to have a method that gets called after the last doc has > been collected. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
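A minimal sketch of the proposed contract (the driver loop and all names other than {{done()}} are hypothetical illustrations, not Lucene code): {{done()}} fires after the last doc of a leaf has been collected, and is skipped when {{collect()}} throws.

```java
// Sketch of the proposed LeafCollector.done() semantics: invoked after the
// last doc of a leaf is collected, but not when collection terminates with
// an exception. collectLeaf() stands in for the search loop.
import java.util.List;

public class DoneSketch {

  interface LeafCollector {
    void collect(int doc);
    void done(); // proposed hook: collection for this leaf has completed
  }

  static void collectLeaf(LeafCollector leaf, List<Integer> docs) {
    for (int doc : docs) {
      leaf.collect(doc); // may throw; done() is then skipped
    }
    leaf.done(); // reached only on normal completion
  }

  // Example use case from the ticket: flushing cached/buffered data.
  static class FlushingCollector implements LeafCollector {
    int buffered;
    int flushed;
    public void collect(int doc) { buffered++; }
    public void done() { flushed += buffered; buffered = 0; }
  }

  public static void main(String[] args) {
    FlushingCollector c = new FlushingCollector();
    collectLeaf(c, List.of(0, 5, 7));
    System.out.println("flushed=" + c.flushed); // flushed=3
  }
}
```

This is what distinguishes {{done()}} from a {{close()}}-style hook: it is not a finally-block cleanup, but a signal of successful completion.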
[jira] [Comment Edited] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961394#comment-13961394 ] Shikhar Bhushan edited comment on LUCENE-4370 at 4/6/14 12:44 PM: -- Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void done();}} Semantics: It is invoked when collection with that leaf has completed. It is not invoked if collection terminates due to an exception. I know this ticket was originally about having such a method on {{Collector}} and not at the segment-level collection, however I think all use cases can be cleanly modelled in this manner. As naming goes, I think {{done()}} or such is better than {{close()}}, which implies a try-finally'esque construct. Edit: changed my proposal from {{finish()}} to {{done()}} to avoid messing with existing uses e.g. {{DelegatingCollector}} which would currently extend {{SimpleCollector}} that implements both {{Collector}} and {{LeafCollector}}. /cc [~jpountz] [~rcmuir] [~hossman] was (Author: shikhar): Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void finish();}} Semantics: It is invoked when collection with that leaf has completed. It is not invoked if collection terminates due to an exception. I know this ticket was originally about having such a method on {{Collector}} and not at the segment-level collection, however I think all use cases can be cleanly modelled in this manner. As naming goes, I think {{finish()}} or {{done()}} or such is better than {{close()}}, which implies a try-finally'esque construct.
/cc [~jpountz] [~rcmuir] [~hossman] > Let Collector know when all docs have been collected > > > Key: LUCENE-4370 > URL: https://issues.apache.org/jira/browse/LUCENE-4370 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 4.0-BETA >Reporter: Tomás Fernández Löbbe >Priority: Minor > > Collectors are a good point for extension/customization of Lucene/Solr, > however sometimes it's necessary to know when the last document has been > collected (for example, for flushing cached data). > It would be nice to have a method that gets called after the last doc has > been collected. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961394#comment-13961394 ] Shikhar Bhushan commented on LUCENE-4370: - Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void finish();}} Semantics: It is invoked when collection with that leaf has completed. It is not invoked if collection terminates due to an exception. I know this ticket was originally about having such a method on {{Collector}} and not at the segment-level collection, however I think all use cases can be cleanly modelled in this manner. As naming goes, I think {{finish()}} or {{done()}} or such is better than {{close()}}, which implies a try-finally'esque construct. /cc [~jpountz] [~rcmuir] [~hossman] > Let Collector know when all docs have been collected > > > Key: LUCENE-4370 > URL: https://issues.apache.org/jira/browse/LUCENE-4370 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 4.0-BETA >Reporter: Tomás Fernández Löbbe >Priority: Minor > > Collectors are a good point for extension/customization of Lucene/Solr, > however sometimes it's necessary to know when the last document has been > collected (for example, for flushing cached data). > It would be nice to have a method that gets called after the last doc has > been collected. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5527) Make the Collector API work per-segment
[ https://issues.apache.org/jira/browse/LUCENE-5527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959101#comment-13959101 ] Shikhar Bhushan commented on LUCENE-5527: - Thanks for picking this up Adrien! I always wanted to push forward at least the API refactoring but did not get the chance to do so. +1 on adding a method like LeafCollector.done() / finish() or such, and making that part of the usage contract. It's not just Solr with DelegatingCollector that has something like this; I think I remember seeing this pattern even in ES. LUCENE-5299 had this as a SubCollector.done() method and it led to a lot of code-cleanup at various places where we were trying to detect a transition to the next segment based on a call to setNextReader(). In some cases, the result finalization was being done lazily when result retrieval methods were called, because there is no other good way of knowing that the last segment has been processed. > Make the Collector API work per-segment > --- > > Key: LUCENE-5527 > URL: https://issues.apache.org/jira/browse/LUCENE-5527 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Fix For: 5.0 > > Attachments: LUCENE-5527.patch > > > Spin-off of LUCENE-5299. > LUCENE-5299 proposes different changes, some of them being controversial, but > there is one of them that I really really like that consists in refactoring > the {{Collector}} API in order to have a different Collector per segment.
> The idea is, instead of having a single Collector object that needs to be > able to take care of all segments, to have a top-level Collector: > {code} > public interface Collector { > AtomicCollector setNextReader(AtomicReaderContext context) throws > IOException; > > } > {code} > and a per-AtomicReaderContext collector: > {code} > public interface AtomicCollector { > void setScorer(Scorer scorer) throws IOException; > void collect(int doc) throws IOException; > boolean acceptsDocsOutOfOrder(); > } > {code} > I think it makes the API clearer since it is now obvious {{setScorer}} and > {{acceptsDocsOutOfOrder}} need to be called after {{setNextReader}} which is > otherwise unclear. > It also makes things more flexible. For example, a collector could much more > easily decide to use different strategies on different segments. In > particular, it makes the early-termination collector much cleaner since it > can return different atomic collectors implementations depending on whether > the current segment is sorted or not. > Even if we have lots of collectors all over the place, we could make it > easier to migrate by having a Collector that would implement both Collector > and AtomicCollector, return {{this}} in setNextReader and make current > concrete Collector implementations extend this class instead of directly > extending Collector. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
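The migration idea in the last paragraph -- one bridge class implementing both interfaces and returning {{this}} from setNextReader -- could look roughly like this (a sketch with simplified signatures, not shipped Lucene code):

```java
// Sketch of the migration bridge: a base class implementing both the
// top-level Collector and the per-segment AtomicCollector, returning
// itself for every segment so existing one-object collectors keep working.
// Signatures are simplified (an int ordinal instead of AtomicReaderContext).
public class BridgeSketch {

  interface AtomicCollector {
    void collect(int doc);
  }

  interface Collector {
    AtomicCollector setNextReader(int readerOrd);
  }

  // Existing single-object collectors extend this and migrate unchanged.
  static abstract class SimpleCollector implements Collector, AtomicCollector {
    public final AtomicCollector setNextReader(int readerOrd) {
      doSetNextReader(readerOrd);
      return this; // the same object serves every segment
    }
    protected void doSetNextReader(int readerOrd) {}
  }

  static class TotalHits extends SimpleCollector {
    int hits;
    public void collect(int doc) { hits++; }
  }

  public static void main(String[] args) {
    TotalHits c = new TotalHits();
    AtomicCollector leaf = c.setNextReader(0);
    leaf.collect(1);
    c.setNextReader(1).collect(4); // next segment, same backing object
    System.out.println(c.hits); // 2
  }
}
```

New, segment-aware collectors would instead implement {{Collector}} directly and hand out a fresh {{AtomicCollector}} per segment.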
[jira] [Commented] (SOLR-5768) Add a distrib.singlePass parameter to make GET_FIELDS phase fetch all fields and skip EXECUTE_QUERY
[ https://issues.apache.org/jira/browse/SOLR-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922669#comment-13922669 ] Shikhar Bhushan commented on SOLR-5768: --- seems like the JIRA title has it the other way round :) > Add a distrib.singlePass parameter to make GET_FIELDS phase fetch all fields > and skip EXECUTE_QUERY > --- > > Key: SOLR-5768 > URL: https://issues.apache.org/jira/browse/SOLR-5768 > Project: Solr > Issue Type: Improvement >Reporter: Shalin Shekhar Mangar >Priority: Minor > Fix For: 4.8, 5.0 > > > Suggested by Yonik on solr-user: > http://www.mail-archive.com/solr-user@lucene.apache.org/msg95045.html > {quote} > Although it seems like it should be relatively simple to make it work > with other fields as well, by passing down the complete "fl" requested > if some optional parameter is set (distrib.singlePass?) > {quote} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5637: -- Attachment: SOLR-5637.patch updated patch against lucene_solr_4_7 branch > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature > Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5367.patch, SOLR-5367.patch, SOLR-5637.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5637: -- Fix Version/s: (was: 4.7) > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature > Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5367.patch, SOLR-5367.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: maven build issues with non-numeric custom version
On Tue, Jan 28, 2014 at 7:28 PM, shikhar wrote: > I have run into this as well. It'd be great to allow arbitrary strings for > versioning of custom releases. > Specifically, I was trying to use git SHAs.
Re: maven build issues with non-numeric custom version
I have run into this as well. It'd be great to allow arbitrary strings for versioning of custom releases. This is the culprit: https://github.com/apache/lucene-solr/blob/branch_4x/lucene/tools/src/java/org/apache/lucene/dependencies/GetMavenDependenciesTask.java#L627 On Tue, Jan 28, 2014 at 4:10 AM, Ryan McKinley wrote: > From: > > http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/dev-tools/maven/README.maven > > It says we can get a custom build number using: > > ant -Dversion=my-special-version get-maven-poms > > > but this fails with: > > BUILD FAILED > > /Users/ryan/workspace/apache/lucene_4x/build.xml:141: The following error > occurred while executing this line: > > /Users/ryan/workspace/apache/lucene_4x/lucene/common-build.xml:1578: The > following error occurred while executing this line: > > /Users/ryan/workspace/apache/lucene_4x/lucene/tools/custom-tasks.xml:122: > Malformed module dependency from > 'lucene-analyzers-phonetic.internal.test.dependencies': > 'lucene/build/analysis/common/lucene-analyzers-common-my-special-version.jar' > > > > Using a numeric version number things work OK. > > > Any ideas? > > > ryan
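To illustrate the failure mode (with hypothetical stand-in patterns, not the actual parsing code in GetMavenDependenciesTask): a jar-filename pattern that assumes a dotted-numeric version cannot match a jar whose version is an arbitrary string such as a git SHA, so the dependency line looks "malformed".

```java
import java.util.regex.Pattern;

// Illustration only: why a numeric-only version pattern breaks on custom
// version strings. Both patterns are hypothetical stand-ins for the jar
// name parsing done by the build task.
public class VersionParseSketch {
  // Assumes versions look like 4.6 or 4.6.1 -- too strict.
  static final Pattern NUMERIC =
      Pattern.compile("(.+?)-(\\d+(?:\\.\\d+)+)\\.jar");
  // Accepts any version string after the artifact name.
  static final Pattern RELAXED =
      Pattern.compile("(.+?)-([^/]+)\\.jar");

  static boolean matches(Pattern p, String jarName) {
    return p.matcher(jarName).matches();
  }

  public static void main(String[] args) {
    String custom = "lucene-analyzers-common-my-special-version.jar";
    System.out.println(matches(NUMERIC, "lucene-core-4.6.1.jar")); // true
    System.out.println(matches(NUMERIC, custom)); // false -> "malformed" error
    System.out.println(matches(RELAXED, custom)); // true
  }
}
```

The relaxed pattern is ambiguous about where the artifact name ends and the version begins, which is presumably why the build task leans on a numeric version shape in the first place.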
[jira] [Commented] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats
[ https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880795#comment-13880795 ] Shikhar Bhushan commented on SOLR-5648: --- [~otis] yup > SolrCore#getStatistics() should nest open searchers' stats > -- > > Key: SOLR-5648 > URL: https://issues.apache.org/jira/browse/SOLR-5648 > Project: Solr > Issue Type: Task >Reporter: Shikhar Bhushan >Priority: Minor > Fix For: 4.7 > > Attachments: SOLR-5648.patch, oldestSearcherStaleness.gif, > openSearchers.gif > > > {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues > in codebases with custom components. > So it is useful to be able to access monitoring information about what > searchers are currently open, and in turn access their stats e.g. > {{openedAt}}. > This can be nested via {{SolrCore#getStatistics()}} which has a > {{_searchers}} collection of all open searchers. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5505) LoggingInfoStream not usable in a multi-core setup
[ https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5505: -- Fix Version/s: 4.7 > LoggingInfoStream not usable in a multi-core setup > - > > Key: SOLR-5505 > URL: https://issues.apache.org/jira/browse/SOLR-5505 > Project: Solr > Issue Type: Bug >Affects Versions: 4.6 > Reporter: Shikhar Bhushan > Fix For: 4.7 > > Attachments: SOLR-5505.patch, SOLR-5505.patch > > > {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core > context. > Previously this was possible by encoding this into the infoStream's file path. > This means in a multi-core setup it is very hard to distinguish between the > infoStream messages for different cores. > {{LoggingInfoStream}} should be automatically configured to prepend the core > name to log messages. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5637: -- Fix Version/s: 4.7 > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature > Reporter: Shikhar Bhushan >Priority: Minor > Fix For: 4.7 > > Attachments: SOLR-5367.patch, SOLR-5367.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats
[ https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5648: -- Fix Version/s: 4.7 > SolrCore#getStatistics() should nest open searchers' stats > -- > > Key: SOLR-5648 > URL: https://issues.apache.org/jira/browse/SOLR-5648 > Project: Solr > Issue Type: Task >Reporter: Shikhar Bhushan >Priority: Minor > Fix For: 4.7 > > Attachments: SOLR-5648.patch, oldestSearcherStaleness.gif, > openSearchers.gif > > > {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues > in codebases with custom components. > So it is useful to be able to access monitoring information about what > searchers are currently open, and in turn access their stats e.g. > {{openedAt}}. > This can be nested via {{SolrCore#getStatistics()}} which has a > {{_searchers}} collection of all open searchers. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats
[ https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5648: -- Attachment: SOLR-5648.patch Patch attached. Note that the {{_searchers}} access is synchronized on {{searcherLock}} as per the usage pattern established in the class. It does not seem like that lock is held for too long wherever it is used, so this should be ok. > SolrCore#getStatistics() should nest open searchers' stats > -- > > Key: SOLR-5648 > URL: https://issues.apache.org/jira/browse/SOLR-5648 > Project: Solr > Issue Type: Task >Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5648.patch > > > {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues > in codebases with custom components. > So it is useful to be able to access monitoring information about what > searchers are currently open, and in turn access their stats e.g. > {{openedAt}}. > This can be nested via {{SolrCore#getStatistics()}} which has a > {{_searchers}} collection of all open searchers. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
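The locking pattern described above can be sketched as follows (a simplification with made-up class names; in SolrCore the list is {{_searchers}} guarded by {{searcherLock}}): copy the list while holding the lock, then build the nested stats outside it, so the critical section stays short.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch of nesting open searchers' stats under the same lock that guards
// the list of open searchers. The lock is held only long enough to snapshot
// the list; stat objects are built afterwards.
public class SearcherStatsSketch {
  static class Searcher {
    final long openedAt = System.currentTimeMillis();
  }

  private final Object searcherLock = new Object();
  private final List<Searcher> searchers = new LinkedList<>();

  void register(Searcher s) {
    synchronized (searcherLock) { searchers.add(s); }
  }

  List<Long> openedAtStats() {
    List<Searcher> snapshot;
    synchronized (searcherLock) {          // hold the lock only to copy
      snapshot = new ArrayList<>(searchers);
    }
    List<Long> stats = new ArrayList<>();  // build stats outside the lock
    for (Searcher s : snapshot) stats.add(s.openedAt);
    return stats;
  }
}
```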
[jira] [Created] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats
Shikhar Bhushan created SOLR-5648: - Summary: SolrCore#getStatistics() should nest open searchers' stats Key: SOLR-5648 URL: https://issues.apache.org/jira/browse/SOLR-5648 Project: Solr Issue Type: Task Reporter: Shikhar Bhushan Priority: Minor {{SolrIndexSearcher}} leaks are cause of garbage collection issues in codebases with custom components. So it is useful to be able to access monitoring information about what searchers are currently open, and in turn access their stats e.g. {{openedAt}}. This can be nested via {{SolrCore#getStatistics()}} which has a {{_searchers}} collection of all open searchers. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats
[ https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5648: -- Description: {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues in codebases with custom components. So it is useful to be able to access monitoring information about what searchers are currently open, and in turn access their stats e.g. {{openedAt}}. This can be nested via {{SolrCore#getStatistics()}} which has a {{_searchers}} collection of all open searchers. was: {{SolrIndexSearcher}} leaks are cause of garbage collection issues in codebases with custom components. So it is useful to be able to access monitoring information about what searchers are currently open, and in turn access their stats e.g. {{openedAt}}. This can be nested via {{SolrCore#getStatistics()}} which has a {{_searchers}} collection of all open searchers. > SolrCore#getStatistics() should nest open searchers' stats > -- > > Key: SOLR-5648 > URL: https://issues.apache.org/jira/browse/SOLR-5648 > Project: Solr > Issue Type: Task >Reporter: Shikhar Bhushan >Priority: Minor > > {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues > in codebases with custom components. > So it is useful to be able to access monitoring information about what > searchers are currently open, and in turn access their stats e.g. > {{openedAt}}. > This can be nested via {{SolrCore#getStatistics()}} which has a > {{_searchers}} collection of all open searchers. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876801#comment-13876801 ] Shikhar Bhushan commented on SOLR-5637: --- For caches where this instrumentation is not desirable, it can be opted out via the XML init arg "perRequestStats" for the SolrCache (takes boolean values "true" / "false"). It currently defaults to true. > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature >Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5367.patch, SOLR-5367.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
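A sketch of how such an opt-out init arg could be parsed (illustrative only; the actual SolrCache init plumbing differs): an absent "perRequestStats" means instrumented, and only an explicit "false" opts out.

```java
import java.util.Map;

// Sketch of honoring the "perRequestStats" init arg described above:
// absent -> true (instrument the cache), "false" -> opt out.
// The plain string map is a simplification of SolrCache init args.
public class CacheInitSketch {
  static boolean perRequestStatsEnabled(Map<String, String> args) {
    String v = args.get("perRequestStats");
    return v == null || Boolean.parseBoolean(v);
  }

  public static void main(String[] args) {
    System.out.println(perRequestStatsEnabled(Map.of())); // true
    System.out.println(
        perRequestStatsEnabled(Map.of("perRequestStats", "false"))); // false
  }
}
```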
[jira] [Comment Edited] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876798#comment-13876798 ] Shikhar Bhushan edited comment on SOLR-5637 at 1/20/14 7:55 PM: Works in the distrib case now, though end up getting aggregate numbers out via {{DebugComponent#merge()}} -- an enhancement might be to make the stats be part of the 'track' response from shards. was (Author: shikhar): Works in the distrib case now, though end up getting aggregate numbers out via {{DebugComponent#merge()} -- an enhancement might be to make the stats be part of the 'track' response from shards. > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature >Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5367.patch, SOLR-5367.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876798#comment-13876798 ] Shikhar Bhushan edited comment on SOLR-5637 at 1/20/14 7:54 PM: Works in the distrib case now, though end up getting aggregate numbers out via {{DebugComponent#merge()} -- an enhancement might be to make the stats be part of the 'track' response from shards. was (Author: shikhar): works in the distrib case now > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature >Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5367.patch, SOLR-5367.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5637: -- Attachment: SOLR-5367.patch works in the distrib case now > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature > Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5367.patch, SOLR-5367.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica
[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873851#comment-13873851 ] Shikhar Bhushan commented on SOLR-4260: --- This may be unrelated - I have not done much digging or looked at the full context, but was just looking at CUSS out of curiosity. Why do we flush() the OutputStream, but then write() on stuff like ending tags? Shouldn't the flush be after all those writes()'s? https://github.com/apache/lucene-solr/blob/lucene_solr_4_6/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.java#L205 > Inconsistent numDocs between leader and replica > --- > > Key: SOLR-4260 > URL: https://issues.apache.org/jira/browse/SOLR-4260 > Project: Solr > Issue Type: Bug > Components: SolrCloud > Environment: 5.0.0.2013.01.04.15.31.51 >Reporter: Markus Jelsma >Assignee: Mark Miller >Priority: Critical > Fix For: 5.0, 4.7 > > Attachments: 192.168.20.102-replica1.png, > 192.168.20.104-replica2.png, clusterstate.png, > demo_shard1_replicas_out_of_sync.tgz > > > After wiping all cores and reindexing some 3.3 million docs from Nutch using > CloudSolrServer we see inconsistencies between the leader and replica for > some shards. > Each core hold about 3.3k documents. For some reason 5 out of 10 shards have > a small deviation in then number of documents. The leader and slave deviate > for roughly 10-20 documents, not more. > Results hopping ranks in the result set for identical queries got my > attention, there were small IDF differences for exactly the same record > causing a record to shift positions in the result set. During those tests no > records were indexed. Consecutive catch all queries also return different > number of numDocs. > We're running a 10 node test cluster with 10 shards and a replication factor > of two and frequently reindex using a fresh build from trunk. I've not seen > this issue for quite some time until a few days ago. 
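The write/flush ordering concern raised in the CUSS comment above can be sketched with plain JDK streams. This is an illustrative example only, not Solr's actual {{ConcurrentUpdateSolrServer}} code: with a buffered stream, a flush() issued before the closing tag is written would push an unterminated document to the server, which is why the flush belongs after all the write()'s.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Illustrative sketch (not Solr's ConcurrentUpdateSolrServer): the flush()
// should come only after the ending tag has been written, otherwise the
// buffered bytes shipped to the server form an incomplete XML payload.
public class FlushOrderSketch {

    public static String writeDoc() {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            BufferedOutputStream out = new BufferedOutputStream(sink, 1024);
            out.write("<stream>".getBytes(StandardCharsets.UTF_8));
            out.write("<doc/>".getBytes(StandardCharsets.UTF_8));
            // a flush() at this point would transmit "<stream><doc/>"
            // without its ending tag
            out.write("</stream>".getBytes(StandardCharsets.UTF_8));
            out.flush(); // flushing after the final write ships a well-formed payload
            return sink.toString();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```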
[jira] [Updated] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5637: -- Attachment: SOLR-5367.patch first cut of patch attached for feedback. it needs to be made to work in the distributed case from {{DebugComponent}} and i'm still figuring things out for that. pointers appreciated :) > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature >Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5367.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5637) Per-request cache statistics
Shikhar Bhushan created SOLR-5637: - Summary: Per-request cache statistics Key: SOLR-5637 URL: https://issues.apache.org/jira/browse/SOLR-5637 Project: Solr Issue Type: New Feature Reporter: Shikhar Bhushan Priority: Minor We have found it very useful to have information on the number of cache hits and misses for key Solr caches (filterCache, documentCache, etc.) at the request level. This is currently implemented in our codebase using custom {{SolrCache}} implementations. I am working on moving to maintaining stats in the {{SolrRequestInfo}} thread-local, and adding hooks in get() methods of SolrCache implementations. This will be glued up using the {{DebugComponent}} and can be requested using a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
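The thread-local approach described in the issue above can be sketched as follows. This is a hypothetical illustration: the class and method names are invented here, while the actual patch stores the counters via {{SolrRequestInfo}} and reports them through {{DebugComponent}}.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a per-request, thread-local stats holder that a
// cache's get() hook updates. Index 0 = hits, index 1 = misses, keyed by
// cache name (e.g. "filterCache", "documentCache").
public class RequestCacheStats {

    private static final ThreadLocal<Map<String, long[]>> STATS =
        ThreadLocal.withInitial(HashMap::new);

    // called from a SolrCache.get() hook: record whether the lookup hit
    public static void record(String cacheName, boolean hit) {
        long[] c = STATS.get().computeIfAbsent(cacheName, k -> new long[2]);
        c[hit ? 0 : 1]++;
    }

    public static long hits(String cacheName) {
        return STATS.get().getOrDefault(cacheName, new long[2])[0];
    }

    public static long misses(String cacheName) {
        return STATS.get().getOrDefault(cacheName, new long[2])[1];
    }

    // cleared when the request finishes, since request threads are pooled
    public static void clear() {
        STATS.get().clear();
    }
}
```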
[jira] [Commented] (SOLR-5629) SolrIndexSearcher.name should include core name
[ https://issues.apache.org/jira/browse/SOLR-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870796#comment-13870796 ] Shikhar Bhushan commented on SOLR-5629: --- Thanks Erick! Yup, in SolrIndexSearcher constructor :) > SolrIndexSearcher.name should include core name > --- > > Key: SOLR-5629 > URL: https://issues.apache.org/jira/browse/SOLR-5629 > Project: Solr > Issue Type: Improvement > Reporter: Shikhar Bhushan >Assignee: Erick Erickson >Priority: Minor > > The name attribute on {{SolrIndexSearcher}} is used in log lines, but does > not include the core name. > So in a multi-core setup it is unnecessarily difficult to trace what core's > searcher is being referred to, e.g. in log lines that provide info on > searcher opens & closes. > One-line patch that helps: > Replace > {noformat} > this.name = "Searcher@" + Integer.toHexString(hashCode()) + (name!=null ? " > "+name : ""); > {noformat} > with > {noformat} > this.name = "Searcher@" + Integer.toHexString(hashCode()) + "[" + > core.getName() + "]" + (name!=null ? " "+name : ""); > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5629) SolrIndexSearcher.name should include core name
Shikhar Bhushan created SOLR-5629: - Summary: SolrIndexSearcher.name should include core name Key: SOLR-5629 URL: https://issues.apache.org/jira/browse/SOLR-5629 Project: Solr Issue Type: Improvement Reporter: Shikhar Bhushan Priority: Minor The name attribute on {{SolrIndexSearcher}} is used in log lines, but does not include the core name. So in a multi-core setup it is unnecessarily difficult to trace what core's searcher is being referred to, e.g. in log lines that provide info on searcher opens & closes. One-line patch that helps: Replace {noformat} this.name = "Searcher@" + Integer.toHexString(hashCode()) + (name!=null ? " "+name : ""); {noformat} with {noformat} this.name = "Searcher@" + Integer.toHexString(hashCode()) + "[" + core.getName() + "]" + (name!=null ? " "+name : ""); {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847146#comment-13847146 ] Shikhar Bhushan commented on LUCENE-5299: - Thanks for your comments Otis. I have certainly run into the situation of not seeing improvements when there is a higher degree of concurrency of search requests. So I want to try to pin down the associated costs (cost of merge, blocking operations, context switching, number/size of segments, etc.) I think this could have real-world applicability, but I don't have evidence yet in terms of a high query concurrency benchmark. Let's take as an example a 32-core server that serves 100 QPS at an average latency of 100ms. You'd expect 10 search tasks/threads to be active on average. So in theory you have 22 cores available for helping out with the search. > If this parallelization is optional and those who choose not to use it don't > suffer from it, then this may be a good option to have for those with > multi-core CPUs with low query concurrency, but if that's not the case It is optional and it is possible for parallelizable collectors to be written in a way that does not penalize the serial use case. E.g. the modifications to {{TopScoreDocCollector}} use a single {{PriorityQueue}} in the serial case, and a {{PriorityQueue}} for each {{AtomicReaderContext}} + 1 for the final merge in case parallelism is used. In the lucene-util benchmarks I ran I did not see a penalty on serial search with the patch. > Refactor Collector API for parallelism > -- > > Key: LUCENE-5299 > URL: https://issues.apache.org/jira/browse/LUCENE-5299 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Shikhar Bhushan > Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, > LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt > > > h2. 
Motivation > We should be able to scale-up better with Solr/Lucene by utilizing multiple > CPU cores, and not have to resort to scaling-out by sharding (with all the > associated distributed system pitfalls) when the index size does not warrant > it. > Presently, IndexSearcher has an optional constructor arg for an > ExecutorService, which gets used for searching in parallel for call paths > where one of the TopDocCollector's is created internally. The > per-atomic-reader search happens in parallel and then the > TopDocs/TopFieldDocs results are merged with locking around the merge bit. > However there are some problems with this approach: > * If arbitary Collector args come into play, we can't parallelize. Note that > even if ultimately results are going to a TopDocCollector it may be wrapped > inside e.g. a EarlyTerminatingCollector or TimeLimitingCollector or both. > * The special-casing with parallelism baked on top does not scale, there are > many Collector's that could potentially lend themselves to parallelism, and > special-casing means the parallelization has to be re-implemented if a > different permutation of collectors is to be used. > h2. Proposal > A refactoring of collectors that allows for parallelization at the level of > the collection protocol. > Some requirements that should guide the implementation: > * easy migration path for collectors that need to remain serial > * the parallelization should be composable (when collectors wrap other > collectors) > * allow collectors to pick the optimal solution (e.g. there might be memory > tradeoffs to be made) by advising the collector about whether a search will > be parallelized, so that the serial use-case is not penalized. > * encourage use of non-blocking constructs and lock-free parallelism, > blocking is not advisable for the hot-spot of a search, besides wasting > pooled threads. 
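The queue-per-segment design mentioned in the comment above (one {{PriorityQueue}} per {{AtomicReaderContext}} plus one for the final merge) can be sketched with JDK priority queues. The names below are illustrative, not the patch's actual {{TopScoreDocCollector}} changes:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch: each segment fills its own bounded top-k min-heap
// with no shared state (so segments can run in parallel, lock-free), and a
// single extra queue merges the per-segment results at the end.
public class PerSegmentTopK {

    public static List<Float> topK(List<float[]> segmentScores, int k) {
        // one queue per segment, filled independently
        List<PriorityQueue<Float>> perSegment = new ArrayList<>();
        for (float[] scores : segmentScores) {
            PriorityQueue<Float> pq = new PriorityQueue<>(); // min-heap, size <= k
            for (float s : scores) {
                pq.offer(s);
                if (pq.size() > k) pq.poll(); // evict the current smallest
            }
            perSegment.add(pq);
        }
        // the "+1" queue: the only serial step, merging per-segment results
        PriorityQueue<Float> merged = new PriorityQueue<>();
        for (PriorityQueue<Float> pq : perSegment) {
            for (float s : pq) {
                merged.offer(s);
                if (merged.size() > k) merged.poll();
            }
        }
        List<Float> result = new ArrayList<>(merged);
        result.sort(Comparator.reverseOrder()); // best score first
        return result;
    }
}
```

In the serial case a collector written this way can fall back to a single queue, which is how the comment argues the non-parallel path avoids any penalty.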
[jira] [Updated] (SOLR-5505) LoggingInfoStream not usabe in a multi-core setup
[ https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5505: -- Attachment: SOLR-5505.patch attaching patch with updated example solrconfig.xml > LoggingInfoStream not usabe in a multi-core setup > - > > Key: SOLR-5505 > URL: https://issues.apache.org/jira/browse/SOLR-5505 > Project: Solr > Issue Type: Bug >Affects Versions: 4.6 > Reporter: Shikhar Bhushan > Attachments: SOLR-5505.patch, SOLR-5505.patch > > > {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core > context. > Previously this was possible by encoding this into the infoStream's file path. > This means in a multi-core setup it is very hard to distinguish between the > infoStream messages for different cores. > {{LoggingInfoStream}} should be automatically configured to prepend the core > name to log messages. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5505) LoggingInfoStream not usabe in a multi-core setup
[ https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837860#comment-13837860 ] Shikhar Bhushan edited comment on SOLR-5505 at 12/3/13 7:18 PM: Hi Ryan, * If the loggerName attribute is missing, it defaults to the fully-qualified class name of LoggingInfoStream (see default value used for getting the attribute). * I will update the example solrconfig.xml, good call! * Even if the logs are being sent to the same file, the logger name is almost always part of the formatter configuration. For the solrconfig.xml perhaps a good example would be {noformat} true {noformat} (I _think_ that actually will substitute core name correctly, will check...). was (Author: shikhar): Hi Ryan, * If the loggerName attribute is missing, it defaults to the fully-qualified class name of LoggingInfoStream (see default value used for getting the attribute). * I will update the example solrconfig.xml, good call! * Even if the logs are being sent to the same file, the logger name is almost always part of the formatter configuration. For the solrconfig.xml perhaps a good example would be {noformat} {noformat} (I _think_ that actually will substitute core name correctly, will check...). > LoggingInfoStream not usabe in a multi-core setup > - > > Key: SOLR-5505 > URL: https://issues.apache.org/jira/browse/SOLR-5505 > Project: Solr > Issue Type: Bug >Affects Versions: 4.6 >Reporter: Shikhar Bhushan > Attachments: SOLR-5505.patch > > > {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core > context. > Previously this was possible by encoding this into the infoStream's file path. > This means in a multi-core setup it is very hard to distinguish between the > infoStream messages for different cores. > {{LoggingInfoStream}} should be automatically configured to prepend the core > name to log messages. 
[jira] [Commented] (SOLR-5505) LoggingInfoStream not usabe in a multi-core setup
[ https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837860#comment-13837860 ] Shikhar Bhushan commented on SOLR-5505: --- Hi Ryan, * If the loggerName attribute is missing, it defaults to the fully-qualified class name of LoggingInfoStream (see default value used for getting the attribute). * I will update the example solrconfig.xml, good call! * Even if the logs are being sent to the same file, the logger name is almost always part of the formatter configuration. For the solrconfig.xml perhaps a good example would be {{}} (I _think_ that actually will substitute core name correctly, will check...). > LoggingInfoStream not usabe in a multi-core setup > - > > Key: SOLR-5505 > URL: https://issues.apache.org/jira/browse/SOLR-5505 > Project: Solr > Issue Type: Bug >Affects Versions: 4.6 >Reporter: Shikhar Bhushan > Attachments: SOLR-5505.patch > > > {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core > context. > Previously this was possible by encoding this into the infoStream's file path. > This means in a multi-core setup it is very hard to distinguish between the > infoStream messages for different cores. > {{LoggingInfoStream}} should be automatically configured to prepend the core > name to log messages. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5505) LoggingInfoStream not usabe in a multi-core setup
[ https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837860#comment-13837860 ] Shikhar Bhushan edited comment on SOLR-5505 at 12/3/13 4:30 PM: Hi Ryan, * If the loggerName attribute is missing, it defaults to the fully-qualified class name of LoggingInfoStream (see default value used for getting the attribute). * I will update the example solrconfig.xml, good call! * Even if the logs are being sent to the same file, the logger name is almost always part of the formatter configuration. For the solrconfig.xml perhaps a good example would be {noformat} {noformat} (I _think_ that actually will substitute core name correctly, will check...). was (Author: shikhar): Hi Ryan, * If the loggerName attribute is missing, it defaults to the fully-qualified class name of LoggingInfoStream (see default value used for getting the attribute). * I will update the example solrconfig.xml, good call! * Even if the logs are being sent to the same file, the logger name is almost always part of the formatter configuration. For the solrconfig.xml perhaps a good example would be {{}} (I _think_ that actually will substitute core name correctly, will check...). > LoggingInfoStream not usabe in a multi-core setup > - > > Key: SOLR-5505 > URL: https://issues.apache.org/jira/browse/SOLR-5505 > Project: Solr > Issue Type: Bug >Affects Versions: 4.6 >Reporter: Shikhar Bhushan > Attachments: SOLR-5505.patch > > > {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core > context. > Previously this was possible by encoding this into the infoStream's file path. > This means in a multi-core setup it is very hard to distinguish between the > infoStream messages for different cores. > {{LoggingInfoStream}} should be automatically configured to prepend the core > name to log messages. 
[jira] [Updated] (SOLR-5505) LoggingInfoStream not usabe in a multi-core setup
[ https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-5505: -- Attachment: SOLR-5505.patch Attaching patch against trunk. It does something different from what I proposed earlier: a) {{LoggingInfoStream}} constructor takes the slf4j {{Logger}} instance to be used as a constructor param. b) {{SolrIndexConfig}} checks if there is a "loggerName" configuration attribute on the "infoStream" tag, and if so this is used as the name for the {{Logger}}. Otherwise, the previous default of the {{LoggingInfoStream}} class name is used. This will enable users to manage the log output using their logging subsystem, e.g. the formatting pattern, which log file to write to, etc. c) Additionally, I removed logging of the thread name from within {{LoggingInfoStream}}, since this is commonly configured at the level of the formatting pattern for a logger. > LoggingInfoStream not usabe in a multi-core setup > - > > Key: SOLR-5505 > URL: https://issues.apache.org/jira/browse/SOLR-5505 > Project: Solr > Issue Type: Bug >Affects Versions: 4.6 >Reporter: Shikhar Bhushan > Attachments: SOLR-5505.patch > > > {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core > context. > Previously this was possible by encoding this into the infoStream's file path. > This means in a multi-core setup it is very hard to distinguish between the > infoStream messages for different cores. > {{LoggingInfoStream}} should be automatically configured to prepend the core > name to log messages.
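The configuration idea in the patch description above can be sketched as follows. The actual patch injects an slf4j {{Logger}}; java.util.logging stands in here so the example is self-contained, and all names are illustrative rather than Solr's real classes:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Minimal sketch (not Solr's LoggingInfoStream): the Logger is chosen by the
// caller, so its name can come from a per-core "loggerName" attribute and the
// logging subsystem can route/format per-core output however it likes.
public class LoggingInfoStreamSketch {

    private final Logger logger;

    public LoggingInfoStreamSketch(Logger logger) {
        this.logger = logger;
    }

    public void message(String component, String message) {
        // no thread name here: the log formatter is the right place for that
        logger.log(Level.INFO, "[{0}] {1}", new Object[] { component, message });
    }

    // mirrors the fallback described above: when no "loggerName" attribute is
    // configured, default to the info-stream class name
    public static String resolveLoggerName(String configuredName) {
        return configuredName != null
            ? configuredName
            : LoggingInfoStreamSketch.class.getName();
    }

    public static LoggingInfoStreamSketch forConfig(String configuredName) {
        return new LoggingInfoStreamSketch(Logger.getLogger(resolveLoggerName(configuredName)));
    }
}
```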
[jira] [Commented] (SOLR-5505) LoggingInfoStream not usabe in a multi-core setup
[ https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832802#comment-13832802 ] Shikhar Bhushan commented on SOLR-5505: --- I'll create a patch today > LoggingInfoStream not usabe in a multi-core setup > - > > Key: SOLR-5505 > URL: https://issues.apache.org/jira/browse/SOLR-5505 > Project: Solr > Issue Type: Bug >Affects Versions: 4.6 >Reporter: Shikhar Bhushan > > {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core > context. > Previously this was possible by encoding this into the infoStream's file path. > This means in a multi-core setup it is very hard to distinguish between the > infoStream messages for different cores. > {{LoggingInfoStream}} should be automatically configured to prepend the core > name to log messages. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5505) LoggingInfoStream not usabe in a multi-core setup
[ https://issues.apache.org/jira/browse/SOLR-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831937#comment-13831937 ] Shikhar Bhushan commented on SOLR-5505: --- This should be a simple patch, {{SolrIndexConfig}} can propagate the core name to the {{LoggingInfoStream}} constructor so that it's available for logging. > LoggingInfoStream not usabe in a multi-core setup > - > > Key: SOLR-5505 > URL: https://issues.apache.org/jira/browse/SOLR-5505 > Project: Solr > Issue Type: Bug >Affects Versions: 4.6 >Reporter: Shikhar Bhushan > > {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core > context. > Previously this was possible by encoding this into the infoStream's file path. > This means in a multi-core setup it is very hard to distinguish between the > infoStream messages for different cores. > {{LoggingInfoStream}} should be automatically configured to prepend the core > name to log messages. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4977) info stream in solrconfig should have option for writing to the solr log
[ https://issues.apache.org/jira/browse/SOLR-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831932#comment-13831932 ] Shikhar Bhushan commented on SOLR-4977: --- LoggingInfoStream does not log the core name which is an important piece of context - created SOLR-5505 for this > info stream in solrconfig should have option for writing to the solr log > > > Key: SOLR-4977 > URL: https://issues.apache.org/jira/browse/SOLR-4977 > Project: Solr > Issue Type: Improvement >Reporter: Ryan Ernst > Fix For: 4.4, 5.0 > > Attachments: SOLR-4977.patch, SOLR-4977.patch, SOLR-4977.patch, > SOLR-4977.patch, SOLR-4977.patch, SOLR-4977.patch, SOLR-4977.patch > > > Having a separate file is annoying, plus the print stream option doesn't > rollover on size or date, doesn't have custom formatting options, etc. > Exactly what the logging lib is meant to handle. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5505) LoggingInfoStream not usabe in a multi-core setup
Shikhar Bhushan created SOLR-5505: - Summary: LoggingInfoStream not usabe in a multi-core setup Key: SOLR-5505 URL: https://issues.apache.org/jira/browse/SOLR-5505 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shikhar Bhushan {{LoggingInfoStream}} that was introduced in SOLR-4977 does not log any core context. Previously this was possible by encoding this into the infoStream's file path. This means in a multi-core setup it is very hard to distinguish between the infoStream messages for different cores. {{LoggingInfoStream}} should be automatically configured to prepend the core name to log messages. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-5299: Attachment: LUCENE-5299.patch > Refactor Collector API for parallelism > -- > > Key: LUCENE-5299 > URL: https://issues.apache.org/jira/browse/LUCENE-5299 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Shikhar Bhushan > Attachments: benchmarks.txt, LUCENE-5299.patch, LUCENE-5299.patch, > LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch > > > h2. Motivation > We should be able to scale-up better with Solr/Lucene by utilizing multiple > CPU cores, and not have to resort to scaling-out by sharding (with all the > associated distributed system pitfalls) when the index size does not warrant > it. > Presently, IndexSearcher has an optional constructor arg for an > ExecutorService, which gets used for searching in parallel for call paths > where one of the TopDocCollector's is created internally. The > per-atomic-reader search happens in parallel and then the > TopDocs/TopFieldDocs results are merged with locking around the merge bit. > However there are some problems with this approach: > * If arbitary Collector args come into play, we can't parallelize. Note that > even if ultimately results are going to a TopDocCollector it may be wrapped > inside e.g. a EarlyTerminatingCollector or TimeLimitingCollector or both. > * The special-casing with parallelism baked on top does not scale, there are > many Collector's that could potentially lend themselves to parallelism, and > special-casing means the parallelization has to be re-implemented if a > different permutation of collectors is to be used. > h2. Proposal > A refactoring of collectors that allows for parallelization at the level of > the collection protocol. 
> Some requirements that should guide the implementation: > * easy migration path for collectors that need to remain serial > * the parallelization should be composable (when collectors wrap other > collectors) > * allow collectors to pick the optimal solution (e.g. there might be memory > tradeoffs to be made) by advising the collector about whether a search will > be parallelized, so that the serial use-case is not penalized. > * encourage use of non-blocking constructs and lock-free parallelism, > blocking is not advisable for the hot-spot of a search, besides wasting > pooled threads. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-5299: Attachment: LUCENE-5299.patch Attaching latest patch. Broken up into commits at https://github.com/shikhar/lucene-solr/compare/apache:trunk...trunk?w=1. > Refactor Collector API for parallelism > -- > > Key: LUCENE-5299 > URL: https://issues.apache.org/jira/browse/LUCENE-5299 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Shikhar Bhushan > Attachments: benchmarks.txt, LUCENE-5299.patch, LUCENE-5299.patch, > LUCENE-5299.patch, LUCENE-5299.patch > > > h2. Motivation > We should be able to scale-up better with Solr/Lucene by utilizing multiple > CPU cores, and not have to resort to scaling-out by sharding (with all the > associated distributed system pitfalls) when the index size does not warrant > it. > Presently, IndexSearcher has an optional constructor arg for an > ExecutorService, which gets used for searching in parallel for call paths > where one of the TopDocCollector's is created internally. The > per-atomic-reader search happens in parallel and then the > TopDocs/TopFieldDocs results are merged with locking around the merge bit. > However there are some problems with this approach: > * If arbitary Collector args come into play, we can't parallelize. Note that > even if ultimately results are going to a TopDocCollector it may be wrapped > inside e.g. a EarlyTerminatingCollector or TimeLimitingCollector or both. > * The special-casing with parallelism baked on top does not scale, there are > many Collector's that could potentially lend themselves to parallelism, and > special-casing means the parallelization has to be re-implemented if a > different permutation of collectors is to be used. > h2. Proposal > A refactoring of collectors that allows for parallelization at the level of > the collection protocol. 
> Some requirements that should guide the implementation: > * easy migration path for collectors that need to remain serial > * the parallelization should be composable (when collectors wrap other > collectors) > * allow collectors to pick the optimal solution (e.g. there might be memory > tradeoffs to be made) by advising the collector about whether a search will > be parallelized, so that the serial use-case is not penalized. > * encourage use of non-blocking constructs and lock-free parallelism, > blocking is not advisable for the hot-spot of a search, besides wasting > pooled threads. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
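The collection protocol the requirements above describe can be sketched with a trivial hit-counting collector. This is a hypothetical shape, not the API from the attached LUCENE-5299 patches: each segment gets an independent sub-collector (no shared mutable state, hence no locking), and the only serial step is one merge at the end.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a composable, parallelizable collector: per-segment
// collectors own their own state, and results are combined in a single
// final merge() call.
public class HitCountCollector {

    // per-segment collector: private state, safe to run segments in parallel
    public static class SegmentHits {
        private int count;

        public void collect(int doc) {
            count++;
        }
    }

    public SegmentHits newSegmentCollector() {
        return new SegmentHits();
    }

    // the one serial step: combine per-segment results after collection
    public int merge(List<SegmentHits> parts) {
        int total = 0;
        for (SegmentHits s : parts) {
            total += s.count;
        }
        return total;
    }
}
```

A serial searcher would simply use one sub-collector for all segments, which is how the "easy migration path" and "no penalty for the serial use-case" requirements can both be met.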
[jira] [Comment Edited] (SOLR-5363) NoClassDefFoundError when using Apache Log4J2
[ https://issues.apache.org/jira/browse/SOLR-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802926#comment-13802926 ] Shikhar Bhushan edited comment on SOLR-5363 at 10/23/13 3:03 PM: - Confirming the issue & Petar's assessment, ran into this as well was (Author: shikhar): Confirming the issue & the Petar's assessment, ran into this as well > NoClassDefFoundError when using Apache Log4J2 > - > > Key: SOLR-5363 > URL: https://issues.apache.org/jira/browse/SOLR-5363 > Project: Solr > Issue Type: Bug >Affects Versions: 4.5 >Reporter: Petar Tahchiev > Labels: log4j2 > Attachments: SOLR-5363.patch > > > Hey guys, > I'm using Log4J2 + SLF4J in my project. Unfortunately my embedded solr server > throws this error when starting: > {code} > Caused by: org.springframework.beans.factory.BeanDefinitionStoreException: > Factory method [public org.springframework.da > ta.solr.core.SolrOperations > com.x.platform.core.config.SolrsearchConfig.defaultSolrTemplate() throws > javax.xml.par > sers.ParserConfigurationException,java.io.IOException,org.xml.sax.SAXException] > threw exception; nested exception is org > .springframework.beans.factory.BeanCreationException: Error creating bean > with name 'defaultSolrServer' defined in class > path resource [com/x/platform/core/config/SolrsearchConfig.class]: > Instantiation of bean failed; nested exception > is org.springframework.beans.factory.BeanDefinitionStoreException: Factory > method [public org.apache.solr.client.solrj. 
> SolrServer > com.xx.platform.core.config.SolrsearchConfig.defaultSolrServer() throws > javax.xml.parsers.ParserConfigur > ationException,java.io.IOException,org.xml.sax.SAXException] threw exception; > nested exception is java.lang.NoClassDefFo > undError: org/apache/log4j/Priority > at > org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy > .java:181) > at > org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolv > er.java:570) > ... 105 more > Caused by: org.springframework.beans.factory.BeanCreationException: Error > creating bean with name 'defaultSolrServer' de > fined in class path resource > [com/xx/platform/core/config/SolrsearchConfig.class]: Instantiation of > bean failed; ne > sted exception is > org.springframework.beans.factory.BeanDefinitionStoreException: Factory > method [public org.apache.solr > .client.solrj.SolrServer > com.xxx.platform.core.config.SolrsearchConfig.defaultSolrServer() throws > javax.xml.parsers. 
> ParserConfigurationException,java.io.IOException,org.xml.sax.SAXException] > threw exception; nested exception is java.lan > g.NoClassDefFoundError: org/apache/log4j/Priority > at > org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolv > er.java:581) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(Ab > stractAutowireCapableBeanFactory.java:1025) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutow > ireCapableBeanFactory.java:921) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCap > ableBeanFactory.java:487) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapab > leBeanFactory.java:458) > at > org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295) > at > org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegis > try.java:223) > at > org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292) > at > org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194) > at > org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(Configurati > onClassEnhancer.java:298) > at > com.xx.platform.core.config.SolrsearchConfig$$EnhancerByCGLIB$$c571c5a6.defaultSolrServer() > at > com.x.platform.core.config.SolrsearchConfig.defaultS
[jira] [Commented] (SOLR-5363) NoClassDefFoundError when using Apache Log4J2
[ https://issues.apache.org/jira/browse/SOLR-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802926#comment-13802926 ] Shikhar Bhushan commented on SOLR-5363: --- Confirming the issue & Petar's assessment, ran into this as well
[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-5299: Attachment: LUCENE-5299.patch > Refactor Collector API for parallelism > -- > > Key: LUCENE-5299 > URL: https://issues.apache.org/jira/browse/LUCENE-5299 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Shikhar Bhushan > Attachments: benchmarks.txt, LUCENE-5299.patch, LUCENE-5299.patch, > LUCENE-5299.patch > > > h2. Motivation > We should be able to scale-up better with Solr/Lucene by utilizing multiple > CPU cores, and not have to resort to scaling-out by sharding (with all the > associated distributed system pitfalls) when the index size does not warrant > it. > Presently, IndexSearcher has an optional constructor arg for an > ExecutorService, which gets used for searching in parallel for call paths > where one of the TopDocCollector's is created internally. The > per-atomic-reader search happens in parallel and then the > TopDocs/TopFieldDocs results are merged with locking around the merge bit. > However there are some problems with this approach: > * If arbitrary Collector args come into play, we can't parallelize. Note that > even if ultimately results are going to a TopDocCollector it may be wrapped > inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both. > * The special-casing with parallelism baked on top does not scale, there are > many Collectors that could potentially lend themselves to parallelism, and > special-casing means the parallelization has to be re-implemented if a > different permutation of collectors is to be used. > h2. Proposal > A refactoring of collectors that allows for parallelization at the level of > the collection protocol. 
> Some requirements that should guide the implementation: > * easy migration path for collectors that need to remain serial > * the parallelization should be composable (when collectors wrap other > collectors) > * allow collectors to pick the optimal solution (e.g. there might be memory > tradeoffs to be made) by advising the collector about whether a search will > be parallelized, so that the serial use-case is not penalized. > * encourage use of non-blocking constructs and lock-free parallelism, > blocking is not advisable for the hot-spot of a search, besides wasting > pooled threads. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
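The ExecutorService-based parallelism the description criticizes can be sketched in miniature. This is a hypothetical simplification, not Lucene's actual code: `ParallelSearchSketch`, `Hit`, and `searchLeaf` are invented stand-ins for the per-atomic-reader search and the locked TopDocs merge step.

```java
import java.util.*;
import java.util.concurrent.*;

// Illustrative sketch (not Lucene's code): search each index leaf in
// parallel, then merge the per-leaf results in a single thread at the end.
public class ParallelSearchSketch {

    // A hypothetical scored hit: document id plus score.
    public record Hit(int doc, float score) {}

    // Stand-in for running the query against one AtomicReader; here a
    // "leaf" is just a precomputed list of hits.
    static List<Hit> searchLeaf(List<Hit> leafHits) {
        return leafHits;
    }

    // One task per leaf, then a serial merge into a global top-k --
    // analogous to the locked TopDocs merge the description mentions.
    public static List<Hit> searchTopK(List<List<Hit>> leaves, int k) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<List<Hit>>> futures = new ArrayList<>();
            for (List<Hit> leaf : leaves) {
                futures.add(pool.submit(() -> searchLeaf(leaf)));
            }
            // Min-heap keeps the k best hits seen so far; weakest on top.
            PriorityQueue<Hit> pq =
                new PriorityQueue<>(Comparator.comparingDouble(Hit::score));
            for (Future<List<Hit>> f : futures) {
                for (Hit h : f.get()) {
                    pq.offer(h);
                    if (pq.size() > k) pq.poll();
                }
            }
            List<Hit> top = new ArrayList<>(pq);
            top.sort(Comparator.comparingDouble(Hit::score).reversed());
            return top;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Note how the parallelism lives entirely outside the collector: if a caller supplies any other Collector, this scheme cannot be applied, which is exactly the special-casing problem the issue describes.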
[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-5299: Attachment: LUCENE-5299.patch Attaching patch with the TopFieldCollector changes + removal of a bunch of unnecessary code from IndexSearcher. Tests pass except for TestExpressionSorts sometimes (see LUCENE-5222); will reopen that and provide a fix.
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801028#comment-13801028 ] Shikhar Bhushan commented on LUCENE-5299: - bq. What do you have the number of search threads set to in luceneutil? I did not change any of the defaults - what setting is this? bq. If this is too low, maybe its not utilizing all your hardware in the benchmark. (like a web server with a too-small PQ) What's a PQ? :)
[jira] [Comment Edited] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801019#comment-13801019 ] Shikhar Bhushan edited comment on LUCENE-5299 at 10/21/13 8:08 PM: --- I'm planning to work on parallelizing TopFieldCollector in the same way as for TopScoreDocCollector, so the special-casing from IndexSearcher can be removed and searches are parallelizable even if that collector gets wrapped in something else by Solr. We are going to be doing some load-tests and latency measurements on one of our experimental clusters using real traffic logs, and I will report those findings. But first need to do that work on TopFieldCollector as most of our requests have multiple sort fields. was (Author: shikhar): I'm planning to work on parallelizing TopFieldCollector in the same way as for TopScoreDocCollector, so the special-casing from IndexSearcher can be removed and searches are parallelizable even if that collector gets wrapped in something else by Solr. We am going to be doing some load-tests and latency measurements on one of our experimental clusters using real traffic logs, and I will report those findings. But first need to do that work on TopFieldCollector as most of our requests have multiple sort fields.
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801019#comment-13801019 ] Shikhar Bhushan commented on LUCENE-5299: - I'm planning to work on parallelizing TopFieldCollector in the same way as for TopScoreDocCollector, so the special-casing from IndexSearcher can be removed and searches are parallelizable even if that collector gets wrapped in something else by Solr. We are going to be doing some load-tests and latency measurements on one of our experimental clusters using real traffic logs, and I will report those findings. But first need to do that work on TopFieldCollector as most of our requests have multiple sort fields.
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800954#comment-13800954 ] Shikhar Bhushan commented on LUCENE-5299: - Thanks for your comments [~thetaphi], I really appreciate the vote of confidence in the API changes :) bq. My biggest concern is not complexity of API (it is actually simpler and easier to understand!): it is more the fact that parallelism of Lucene Queries is in most cases not the best thing to do (if you have many users). It only makes sense if you have very few queries - which is not what full-text searches are used for. The overhead for merging is higher than what you get, especially when many users hit your search engine in parallel! I generally don't recommend to users to use the parallelization currently available in IndexSearcher. Every user gets one thread and if you have many users buy more processors. With additional parallelism this does not scale if the userbase grows. There is certainly more work to be done overall per search-request for the Collectors where parallelization => merge step(s) [1]. It could mean better latency at the cost of additional hardware to sustain the same level of load. But it's a choice that should be available when developing search applications. [1] there are trivially parallelizable collectors where the merge step is either really small or non-existent: e.g. TotalHitCountCollector, or even FacetCollector (https://github.com/shikhar/lucene-solr/commit/032683da739bf15c1a8afe9f15cb2586baa0b201?w=1)
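The footnote's claim that some collectors are trivially parallelizable can be made concrete with a hit-count sketch. This is hypothetical illustration code, not the real TotalHitCountCollector: each slice adds to a lock-free accumulator, so the "merge" is just a sum and its overhead is essentially zero.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: per-slice hit counting where the merge is a plain sum.
public class HitCountSketch {

    public static long countHits(List<int[]> docIdSlices) {
        // Lock-free accumulation, in the spirit of the "non-blocking
        // constructs and lock-free parallelism" requirement in the issue.
        LongAdder total = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int[] slice : docIdSlices) {
                // Each task counts its own slice of matching doc ids.
                futures.add(pool.submit(() -> total.add(slice.length)));
            }
            for (Future<?> f : futures) f.get(); // wait for all slices
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return total.sum(); // the entire "merge step"
    }
}
```

For collectors of this shape, the merging cost Uwe worries about does not apply; the question is only whether the per-slice work is large enough to pay for the thread handoff.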
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800947#comment-13800947 ] Shikhar Bhushan commented on LUCENE-5299: - bq. Could you describe a bit about the high level design changes? There is an overview in this email under 'Idea': http://mail-archives.apache.org/mod_mbox/lucene-dev/201310.mbox/%3CCAE_Gd_dt6LY5T9r6ty%2B1j2xEbdr84OCPkU5swsQn10cbDt81Ew%40mail.gmail.com%3E bq. In the benchmarks, is "par vs par" the before/after test? Ie baseline = current trunk, passed an ES to IndexSearcher, and then comp = with this patch, also passing ES to IndexSearcher? Exactly, sorry that wasn't made clear. bq. In general, I suspect fine grained parallelism is trickier / more costly than the "merge in the end" parallelism we have now. Typically collection is not a very costly part of the search ... and merging the results in the end should be a minor cost, that shrinks as the index gets larger. "Typically collection is not a very costly part of the search" - I don't know if that's true. Are you referring to just the bits that might happen inside a Collector, or a broader definition of collection as including scoring and potentially some degree of I/O? This change is aiming to parallelize the latter. To do this the Collector API needs refactoring to cleanly separate out the AtomicReader-level state and the composite state, in case they are different.
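The separation of AtomicReader-level state from composite state mentioned in the last comment can be sketched as a small protocol. The names below (`LeafState`, `CompositeCollector`, `CountingCollector`) are hypothetical, invented for illustration only; they are not the API in the attached patch.

```java
import java.util.*;

// Hypothetical protocol: each leaf gets its own isolated state object,
// and the composite collector merges the leaf results after collection.
public class CollectorProtocolSketch {

    // Per-leaf state: written by exactly one thread, so no locking needed.
    public interface LeafState<R> {
        void collect(int doc);
        R result();
    }

    // Composite side: hands out fresh leaf states and merges their results.
    // Wrapping collectors could compose by delegating both methods.
    public interface CompositeCollector<R> {
        LeafState<R> newLeafState();
        R merge(List<R> leafResults);
    }

    // A minimal implementation: count collected docs per leaf, sum at merge.
    public static class CountingCollector implements CompositeCollector<Integer> {
        public LeafState<Integer> newLeafState() {
            return new LeafState<>() {
                int count;
                public void collect(int doc) { count++; }
                public Integer result() { return count; }
            };
        }
        public Integer merge(List<Integer> leafResults) {
            return leafResults.stream().mapToInt(Integer::intValue).sum();
        }
    }
}
```

Because the composite never sees a leaf's state until collection is done, the searcher is free to drive the leaf states from one thread or many, which is the composability requirement from the issue description.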
[jira] [Issue Comment Deleted] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-5299: Comment: was deleted (was: patch and benchmarks to come...)
[jira] [Updated] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated LUCENE-5299: Attachment: benchmarks.txt LUCENE-5299.patch attaching patch against trunk + benchmarks -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Refactoring Collector interface for parallelism
Thanks Shawn! I have created these 2 tickets: https://issues.apache.org/jira/browse/LUCENE-5299 (the refactoring) https://issues.apache.org/jira/browse/SOLR-5372 (so Solr can take advantage of it) On Mon, Oct 21, 2013 at 9:48 AM, Shawn Heisey wrote: > On 10/21/2013 7:10 AM, Shikhar Bhushan wrote: > > I wanted to add a note as to the motivation for these changes. > > Essentially we should be able to scale-up better with Solr/Lucene and > > not just scale-out by sharding. With sharding one enters distributed > > systems territory with all the pitfalls and failure conditions, which is > > not ideal especially if your index is not large enough to warrant > > it. The next frontier seems to be to utilize multiple CPU cores for > > performing a search request's legwork, and the results are very > > promising - 2-3x speedups in many cases! > > > > Would be great to get a sense of committer interest :) Let me know if I > > should just open a JIRA with patches. > > Although I do not have much understanding of the deep internals, I think > I can safely say this: It is almost never a mistake to open an issue in > Jira and attach a patch, even if the idea ultimately never gets used. > > In cases where a patch shows significant performance gains, the code > needs careful review to make sure that it doesn't break anything or make > radical changes to the API. As you might know, major changes to the API > are typically reserved for the next major release, unless the old API > can easily be kept and deprecated. > > If it's a very good patch, the performance improvement will still be > significant after any problems are fixed. > > Thanks, > Shawn > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800762#comment-13800762 ] Shikhar Bhushan commented on LUCENE-5299: - patch and benchmarks to come... -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5372) SolrIndexSearcher should support propagating an ExecutorService up to the IndexSearcher constructor
Shikhar Bhushan created SOLR-5372: - Summary: SolrIndexSearcher should support propagating an ExecutorService up to the IndexSearcher constructor Key: SOLR-5372 URL: https://issues.apache.org/jira/browse/SOLR-5372 Project: Solr Issue Type: Improvement Reporter: Shikhar Bhushan This could probably be made configurable from solrconfig.xml. We should be able to easily configure the kind of executor to be chosen with params. The idea here is to benefit from improvements being proposed in LUCENE-5299 -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5299) Refactor Collector API for parallelism
Shikhar Bhushan created LUCENE-5299: --- Summary: Refactor Collector API for parallelism Key: LUCENE-5299 URL: https://issues.apache.org/jira/browse/LUCENE-5299 Project: Lucene - Core Issue Type: Improvement Reporter: Shikhar Bhushan h2. Motivation We should be able to scale-up better with Solr/Lucene by utilizing multiple CPU cores, and not have to resort to scaling-out by sharding (with all the associated distributed system pitfalls) when the index size does not warrant it. Presently, IndexSearcher has an optional constructor arg for an ExecutorService, which gets used for searching in parallel for call paths where one of the TopDocCollectors is created internally. The per-atomic-reader search happens in parallel and then the TopDocs/TopFieldDocs results are merged with locking around the merge bit. However there are some problems with this approach: * If arbitrary Collector args come into play, we can't parallelize. Note that even if ultimately results are going to a TopDocCollector it may be wrapped inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both. * The special-casing with parallelism baked on top does not scale: there are many Collectors that could potentially lend themselves to parallelism, and special-casing means the parallelization has to be re-implemented if a different permutation of collectors is to be used. h2. Proposal A refactoring of collectors that allows for parallelization at the level of the collection protocol. Some requirements that should guide the implementation: * easy migration path for collectors that need to remain serial * the parallelization should be composable (when collectors wrap other collectors) * allow collectors to pick the optimal solution (e.g. there might be memory tradeoffs to be made) by advising the collector about whether a search will be parallelized, so that the serial use-case is not penalized.
* encourage use of non-blocking constructs and lock-free parallelism; blocking is not advisable in the hot spot of a search, besides wasting pooled threads. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
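The Motivation above describes the status quo: per-reader search runs on an ExecutorService and partial results are merged under a lock. A minimal toy model of that shape, with no Lucene dependency (all names here, such as parallelCount and the int[] "segments", are hypothetical stand-ins, not Lucene classes):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model of per-segment parallel search with a locked merge step.
public class LockedMergeSearch {

    // Pretend "search": count docs in one segment at or above a threshold.
    static int searchSegment(int[] segmentDocs, int threshold) {
        int hits = 0;
        for (int doc : segmentDocs) {
            if (doc >= threshold) hits++;
        }
        return hits;
    }

    public static int parallelCount(List<int[]> segments, int threshold) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        int[] total = {0};
        Object mergeLock = new Object();
        List<Future<?>> futures = new ArrayList<>();
        for (int[] seg : segments) {
            futures.add(pool.submit(() -> {
                int hits = searchSegment(seg, threshold);
                synchronized (mergeLock) { // the merge step is serialized
                    total[0] += hits;
                }
            }));
        }
        try {
            for (Future<?> f : futures) f.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return total[0];
    }

    public static void main(String[] args) {
        List<int[]> segments = List.of(new int[]{1, 5, 9}, new int[]{2, 7}, new int[]{10});
        System.out.println(parallelCount(segments, 5)); // prints 4
    }
}
```

The contention on mergeLock is exactly the kind of hot-spot blocking the last requirement above argues against; the proposal moves the merge into the collection protocol itself.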
Re: Refactoring Collector interface for parallelism
I wanted to add a note as to the motivation for these changes. Essentially we should be able to scale-up better with Solr/Lucene and not just scale-out by sharding. With sharding one enters distributed systems territory with all the pitfalls and failure conditions, which is not ideal especially if your index is not large enough to warrant it. The next frontier seems to be to utilize multiple CPU cores for performing a search request's legwork, and the results are very promising - 2-3x speedups in many cases! Would be great to get a sense of committer interest :) Let me know if I should just open a JIRA with patches. On Sat, Oct 19, 2013 at 6:18 PM, Shikhar Bhushan wrote: > I got inspired by Uwe Schindler's talk "Is your index reader really > atomic or maybe > slow?"<http://www.lucenerevolution.org/sites/default/files/Schindler%20-%20IsYourIndexReaderReallyAtomicOrMaybeSlow_0.pdf>to > look into the state of parallelism when searching across segments. It > seems there has been some effort to do this but we never went all the way. > > IndexSearcher has an optional constructor arg for an ExecutorService, > which gets used for searching in parallel for call paths where one of the > TopDocCollector's is created internally. The per-segment search happens in > parallel and then the TopDocs/TopFieldDocs results are merged with locking > around the merge bit. > > What's the upside when performing parallel search? I benchmarked using > luceneutil on WIKI_MEDIUM_10M with trunk vs trunk with IndexSearcher > constructor hacked to initialize a > ForkJoinPool<https://gist.github.com/anonymous/7048089> > . 
> > Report after iter 19:
>                 Task   QPS baseline (StdDev)   QPS patched (StdDev)     Pct diff
>              Respell    63.05 (3.0%)    46.37 (2.9%)    -26.5% ( -31% - -21%)
>             PKLookup   255.80 (2.6%)   194.11 (7.4%)    -24.1% ( -33% - -14%)
>               Fuzzy2    54.66 (3.8%)    42.28 (3.5%)    -22.7% ( -28% - -16%)
>               Fuzzy1    75.35 (2.9%)    59.23 (4.2%)    -21.4% ( -27% - -14%)
>      MedSloppyPhrase   141.74 (3.6%)   124.56 (9.7%)    -12.1% ( -24% - 1%)
>           AndHighLow   922.70 (3.1%)   908.18 (10.0%)    -1.6% ( -14% - 11%)
>      LowSloppyPhrase    88.50 (4.2%)   106.64 (13.0%)    20.5% ( 3% - 39%)
>              LowTerm   694.97 (2.7%)   949.30 (21.6%)    36.6% ( 11% - 62%)
>          MedSpanNear   102.47 (3.0%)   160.94 (12.7%)    57.1% ( 40% - 75%)
>         OrNotHighLow    98.58 (4.9%)   155.22 (18.1%)    57.5% ( 32% - 84%)
>           AndHighMed   260.31 (1.7%)   448.19 (23.5%)    72.2% ( 46% - 99%)
>            OrHighLow   101.68 (6.7%)   179.91 (27.8%)    76.9% ( 39% - 119%)
>              MedTerm   200.49 (3.1%)   365.70 (32.8%)    82.4% ( 45% - 122%)
>             HighTerm   145.36 (3.3%)   268.51 (27.7%)    84.7% ( 51% - 119%)
>              Prefix3    81.59 (2.8%)   153.52 (24.8%)    88.2% ( 58% - 119%)
>         OrHighNotLow    50.67 (6.9%)    95.91 (22.5%)    89.3% ( 56% - 127%)
>            OrHighMed    93.37 (6.3%)   182.37 (24.3%)    95.3% ( 60% - 134%)
>         OrHighNotMed    78.19 (6.7%)   153.30 (30.7%)    96.1% ( 54% - 143%)
>           OrHighHigh    46.59 (7.0%)    92.33 (31.7%)    98.2% ( 55% - 147%)
>        OrHighNotHigh    51.01 (5.7%)   105.32 (25.1%)   106.5% ( 71% - 145%)
>             Wildcard    78.53 (3.7%)   168.42 (40.8%)   114.5% ( 67% - 164%)
>        OrNotHighHigh    40.42 (5.5%)    93.45 (29.5%)   131.2% ( 91% - 175%)
>         OrNotHighMed    24.40 (4.8%)    57.00 (22.0%)   133.6% ( 102% - 168%)
>         HighSpanNear     4.13 (3.6%)    10.09 (10.5%)   144.3% ( 125% - 164%)
>          LowSpanNear    14.27 (2.3%)    35.24 (17.0%)   147.0% ( 124% - 170%)
>               IntNRQ     8.73 (4.8%)    21.69 (23.6%)   148.6% ( 114% - 185%)
>            LowPhrase    29.11 (2.9%)    72.87 (22.4%)   150.3% ( 121% - 180%)
>            MedPhrase    33.19 (3.4%)    83.38 (40.4%)   151.3% ( 103% - 201%)
>     HighSloppyPhrase     6.46 (5.8%)    16.37 (21.9%)   153.5% ( 119% - 192%)
>           HighPhrase    10.00 (6.8%)    25.66
Refactoring Collector interface for parallelism
*Idea* I have been exploring a refactoring of Collectors that allows for parallelization at the level of the collection protocol. Some of the design decisions: - easy migration path for collectors that want to remain serial - the parallelization should be composable (when Collectors wrap other Collectors) - allow collectors to pick the optimal solution (e.g. there might be memory tradeoffs to be made) by advising the collector about whether a search will be parallelized - encourage use of lock-free parallelism by providing for a segment-level done() method that is guaranteed to execute in a single-threaded manner (so if you were to accumulate some state at the segment level, it can be safely merged with the composite state without use of locking) The code is currently in a fork on github <https://github.com/shikhar/lucene-solr/compare/apache:trunk...trunk>. - Collector goes from being an abstract class to this interface <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/Collector.java>. A new interface SubCollector <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/SubCollector.java> is introduced. - The existing subclasses of Collector can choose to either implement the new interface, or subclass a SerialCollector <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/SerialCollector.java> abstract class which maintains essentially the same API as before. - There is javadoc on the new interfaces, but the new collection protocol is probably clearer in connection with the usage from IndexSearcher <https://github.com/shikhar/lucene-solr/blob/b3098fc/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L571-L636>. 
- I tried to convert at least all of the collectors that wrap other collectors over to the new interface, rather than simply extending SerialCollector (also introduced a new WrappingCollector <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/WrappingCollector.java> base class), so that the wrappers themselves aren't blockers to parallelizing a search. There are probably a few cases I have missed. - An example of a collector that is very easily parallelizable in the divide-and-conquer style: TotalHitCountCollector <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollector.java> - An update of TopScoreDocCollector to a parallelizable Collector <https://github.com/shikhar/lucene-solr/commit/b3098fcfdeb302481e93aab93eb77d0cabc72f9b?w=1>, and corresponding removal of the special-cased parallelism inside IndexSearcher. Benchmark results <https://gist.github.com/shikhar/7062026> *Next steps* I would love to get your feedback on the idea and implementation, and how this could be incorporated into trunk. Code review would be much appreciated. Some follow-up work that needs to be done (help needed!): - Making TopFieldCollector a parallelizable Collector, as was done for TopScoreDocCollector, and removal of the special-casing inside IndexSearcher. - Solr support -- perhaps making the Executor to be used by SolrIndexSearcher configurable from solrconfig.xml. - Making more collectors parallelizable - lots of opportunities here. I guess this will be an ongoing process. Looking forward to your thoughts! Thanks, Shikhar Bhushan
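The protocol described above (per-reader SubCollector, lock-free collection, a single-threaded done() merge hook) can be sketched without Lucene. The interface and method names below are guesses at the shape of the proposal, not the actual patch (the real code is in the linked fork):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of a SubCollector-style protocol: collect runs in parallel per
// segment, done() runs single-threaded so segment-local state merges safely.
public class SubCollectorSketch {

    interface SubCollector {
        void collect(int doc);
        void done(); // guaranteed single-threaded merge hook
    }

    interface ParallelCollector {
        SubCollector subCollector(int readerOrd);
    }

    // Parallel analogue of TotalHitCountCollector (divide-and-conquer style).
    static class TotalHitCount implements ParallelCollector {
        long total; // only touched from the serial done() phase

        public SubCollector subCollector(int readerOrd) {
            return new SubCollector() {
                int hits; // segment-local state, no synchronization needed
                public void collect(int doc) { hits++; }
                public void done() { total += hits; } // serial merge
            };
        }
    }

    public static long search(List<int[]> segments) {
        TotalHitCount collector = new TotalHitCount();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<SubCollector> subs = new ArrayList<>();
        List<Future<?>> running = new ArrayList<>();
        for (int ord = 0; ord < segments.size(); ord++) {
            SubCollector sub = collector.subCollector(ord);
            subs.add(sub);
            int[] docs = segments.get(ord);
            running.add(pool.submit(() -> {
                for (int doc : docs) sub.collect(doc); // parallel phase
            }));
        }
        try {
            for (Future<?> f : running) f.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        for (SubCollector sub : subs) sub.done(); // single-threaded merge phase
        return collector.total;
    }

    public static void main(String[] args) {
        System.out.println(search(List.of(new int[]{1, 2, 3}, new int[]{4, 5}))); // prints 5
    }
}
```

Note the design point from the email: because Future.get() happens-before the done() calls, the segment-local hits fields can be merged on one thread without any locking.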
Re: [jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
On Sunday, October 6, 2013, Shai Erera (JIRA) wrote: > > [ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787730#comment-13787730] > > Shai Erera commented on LUCENE-5189: > > > bq. If SR itself doesn't need ref-counting, perhaps we can pull this out of > SR then? (rote-refactor into DV-thingy or something). > > You mean something like SegmentDocValues? It's doable I guess. SR would > need to keep track of the DV.gens it uses though, so that in SR.doClose it > can call segDV.decRef(gens) so that the latter can decRef all the DVPs that > are used for these gens. If it also removes a gen from the map when it's no > longer referenced by any SR, we don't need to take care of clearing the > genDVP map when all SRs were closed (otherwise I think we'll need to > refCount SegDV too, like SCR). > > I'll give it a shot. > > > Numeric DocValues Updates > > - > > > > Key: LUCENE-5189 > > URL: https://issues.apache.org/jira/browse/LUCENE-5189 > > Project: Lucene - Core > > Issue Type: New Feature > > Components: core/index > >Reporter: Shai Erera > >Assignee: Shai Erera > > Attachments: LUCENE-5189-4x.patch, LUCENE-5189.patch, > LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, > LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, > LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189_process_events.patch, > LUCENE-5189_process_events.patch, LUCENE-5189-updates-order.patch, > LUCENE-5189-updates-order.patch > > > > > > In LUCENE-4258 we started to work on incremental field updates, however > the amount of changes are immense and hard to follow/consume. The reason is > that we targeted postings, stored fields, DV etc., all from the get go. > > I'd like to start afresh here, with numeric-dv-field updates only. There > are a couple of reasons to that: > > * NumericDV fields should be easier to update, if e.g. 
we write all the > values of all the documents in a segment for the updated field (similar to > how livedocs work, and previously norms). > > * It's a fairly contained issue, attempting to handle just one data type > to update, yet requires many changes to core code which will also be useful > for updating other data types. > > * It has value in and on itself, and we don't need to allow updating all > the data types in Lucene at once ... we can do that gradually. > > I have some working patch already which I'll upload next, explaining the > changes. > > > > -- > This message was sent by Atlassian JIRA > (v6.1#6144) > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
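The scheme Shai describes above (a map from docvalues generation to a ref-counted producer, where removing a gen when its count hits zero makes the map empty itself out once all readers close) can be modeled in miniature. All names here are hypothetical stand-ins, not the actual LUCENE-5189 patch:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of gen-keyed ref-counting: readers incRef() each gen they use
// and decRef() them on close; the last decRef() removes the entry, so no
// separate clearing step is needed when all readers are closed.
public class GenRefCounts {

    private final Map<Long, Integer> refsByGen = new HashMap<>();

    public synchronized void incRef(long gen) {
        refsByGen.merge(gen, 1, Integer::sum);
    }

    public synchronized void decRef(long gen) {
        int refs = refsByGen.get(gen) - 1;
        if (refs == 0) {
            refsByGen.remove(gen); // last reader gone: release the producer here
        } else {
            refsByGen.put(gen, refs);
        }
    }

    public synchronized int liveGens() {
        return refsByGen.size();
    }

    public static void main(String[] args) {
        GenRefCounts dv = new GenRefCounts();
        dv.incRef(3); // reader A uses gen 3
        dv.incRef(3); // reader B reuses gen 3
        dv.decRef(3); // reader A closes
        System.out.println(dv.liveGens()); // prints 1
        dv.decRef(3); // reader B closes
        System.out.println(dv.liveGens()); // prints 0
    }
}
```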
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765739#comment-13765739 ] Shikhar Bhushan commented on SOLR-4816: --- Thanks Mark! Also for adding the call to lbServer.shutdown() when appropriate. This is a really minor thing, but I later realized {{final Map<String,Future<NamedList<?>>> responseFutures = new HashMap<String,Future<NamedList<?>>>();}} is better declared with an initialCapacity as that is known {{final Map<String,Future<NamedList<?>>> responseFutures = new HashMap<String,Future<NamedList<?>>>(routes.size());}} > Add document routing to CloudSolrServer > --- > > Key: SOLR-4816 > URL: https://issues.apache.org/jira/browse/SOLR-4816 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 4.3 >Reporter: Joel Bernstein >Assignee: Mark Miller >Priority: Minor > Fix For: 4.5, 5.0 > > Attachments: RequestTask-removal.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch > > > This issue adds the following enhancements to CloudSolrServer's update logic: > 1) Document routing: Updates are routed directly to the correct shard leader > eliminating document routing at the server. > 2) Optional parallel update execution: Updates for each shard are executed in > a separate thread so parallel indexing can occur across the cluster. > These enhancements should allow for near linear scalability on indexing > throughput. 
> Usage: > CloudSolrServer cloudClient = new CloudSolrServer(zkAddress); > cloudClient.setParallelUpdates(true); > SolrInputDocument doc1 = new SolrInputDocument(); > doc1.addField(id, "0"); > doc1.addField("a_t", "hello1"); > SolrInputDocument doc2 = new SolrInputDocument(); > doc2.addField(id, "2"); > doc2.addField("a_t", "hello2"); > UpdateRequest request = new UpdateRequest(); > request.add(doc1); > request.add(doc2); > request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false); > NamedList response = cloudClient.request(request); // Returns a backwards > compatible condensed response. > //To get more detailed response down cast to RouteResponse: > CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4998) be more precise about IOContext for reads
[ https://issues.apache.org/jira/browse/LUCENE-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765122#comment-13765122 ] Shikhar Bhushan commented on LUCENE-4998: - [~mikemccand] maybe this patch is up your alley? > be more precise about IOContext for reads > - > > Key: LUCENE-4998 > URL: https://issues.apache.org/jira/browse/LUCENE-4998 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Shikhar Bhushan >Priority: Minor > Fix For: 5.0, 4.5 > > Attachments: LUCENE-4998.patch > > > Set the context as {{IOContext.READ}} / {{IOContext.READONCE}} where > applicable > > Motivation: > Custom {{PostingsFormat}} may want to check the context on > {{SegmentReadState}} and branch differently, but for this to work properly > the context has to be specified correctly up the stack. > For example, {{DirectPostingsFormat}} only loads postings into memory if the > {{context != MERGE}}. However a better condition would be {{context == > Context.READ && !context.readOnce}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
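The condition proposed in LUCENE-4998 above is a one-line predicate; a self-contained illustration of the old versus proposed check follows. IOCtx here is a hand-rolled stand-in for Lucene's IOContext (a Context value plus a readOnce flag), and only the branching logic is the point:

```java
// Sketch of the load decision for a format like DirectPostingsFormat.
public class LoadDecision {

    enum Context { MERGE, READ, FLUSH, DEFAULT }

    static final class IOCtx {
        final Context context;
        final boolean readOnce;
        IOCtx(Context context, boolean readOnce) {
            this.context = context;
            this.readOnce = readOnce;
        }
    }

    // Current check: load postings into memory unless we are merging.
    static boolean loadOld(IOCtx ctx) {
        return ctx.context != Context.MERGE;
    }

    // Proposed check: load only for long-lived reads.
    static boolean loadNew(IOCtx ctx) {
        return ctx.context == Context.READ && !ctx.readOnce;
    }

    public static void main(String[] args) {
        IOCtx onceOff = new IOCtx(Context.READ, true); // a one-shot read
        System.out.println(loadOld(onceOff)); // prints true  (loads, wastefully)
        System.out.println(loadNew(onceOff)); // prints false (skips the load)
    }
}
```

The difference only matters if callers up the stack pass a precise context, which is why the patch threads IOContext.READ / IOContext.READONCE through the read paths.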
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765120#comment-13765120 ] Shikhar Bhushan commented on SOLR-4816: --- This is a separate issue but worth noting: CloudSolrServer.shutdown() does not call lbServer.shutdown(). In case the lbServer is provided as a constructor arg from outside, that probably makes sense. But in the case of the constructors where it is created internally, IMO CloudSolrServer should assume ownership and also shut it down. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765114#comment-13765114 ] Shikhar Bhushan commented on SOLR-4816: --- We've run into some issues with CloudSolrServer leaking loads of LBHttpSolrServer's aliveCheckExecutor thread pools with {{parallelUpdates = true}}. The root cause here is that the RequestTask inner class is creating a new LBHttpSolrServer for each run() rather than utilizing CloudSolrServer.lbServer which is already available to it. Some detail: LBHttpSolrServer lazily initializes a single-threaded ScheduledExecutorService for the "aliveCheckExecutor" when e.g. there is some kind of error talking to a server. So this issue tends to come up when Solr nodes are unavailable and exceptions are thrown. There is also no call to shutdown() on that LBHttpSolrServer which gets created from RequestTask.run(). LBHttpSolrServer does have a finalizer that tries to shutdown the aliveCheckExecutor but there's no guarantee of finalizers executing (or maybe there is some other memory leak preventing that LBHttpSolrServer from being GC'ed at all). So the one-liner fix that should definitely go in is to simply have RequestTask use CloudSolrServer.lbServer. I have attached a patch that removes RequestTask altogether in favor of simply using Callable's and Future's which is much more idiomatic. 
(RequestTask-removal.patch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
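The Callable/Future shape suggested in the comment above (one task per route, all submitted to a single shared executor, so nothing like a fresh LBHttpSolrServer with its lazily created aliveCheckExecutor is built per request) looks roughly like this. sendUpdate and the String route/response types are hypothetical stand-ins for the SolrJ request plumbing:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of per-route parallel updates using a shared executor.
public class ParallelRoutes {

    // Placeholder for lbServer.request(...) against one shard leader.
    static String sendUpdate(String leaderUrl) {
        return "OK:" + leaderUrl;
    }

    public static Map<String, String> sendAll(Collection<String> routes, ExecutorService pool) {
        // size the maps up front, since the route count is known
        Map<String, Future<String>> futures = new HashMap<>(routes.size());
        for (String route : routes) {
            futures.put(route, pool.submit(() -> sendUpdate(route)));
        }
        Map<String, String> responses = new HashMap<>(routes.size());
        try {
            for (Map.Entry<String, Future<String>> entry : futures.entrySet()) {
                responses.put(entry.getKey(), entry.getValue().get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
        return responses;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2); // shared, reused across requests
        System.out.println(sendAll(List.of("shard1", "shard2"), pool));
        pool.shutdown();
    }
}
```

Because the pool is owned by the caller and reused, no thread pool can leak per request, which is the failure mode described above.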
[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shikhar Bhushan updated SOLR-4816: -- Attachment: RequestTask-removal.patch > Add document routing to CloudSolrServer > --- > > Key: SOLR-4816 > URL: https://issues.apache.org/jira/browse/SOLR-4816 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 4.3 >Reporter: Joel Bernstein >Assignee: Mark Miller >Priority: Minor > Fix For: 4.5, 5.0 > > Attachments: RequestTask-removal.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, > SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch > > > This issue adds the following enhancements to CloudSolrServer's update logic: > 1) Document routing: Updates are routed directly to the correct shard leader > eliminating document routing at the server. > 2) Optional parallel update execution: Updates for each shard are executed in > a separate thread so parallel indexing can occur across the cluster. > These enhancements should allow for near linear scalability on indexing > throughput. 
> Usage:
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
> cloudClient.setParallelUpdates(true);
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField("id", "0");
> doc1.addField("a_t", "hello1");
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField("id", "2");
> doc2.addField("a_t", "hello2");
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
> NamedList response = cloudClient.request(request); // Returns a backwards-compatible condensed response.
> // To get a more detailed response, downcast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
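The two enhancements described above (hash-based routing plus per-shard batches) can be sketched roughly as follows. This is only an illustrative sketch, not CloudSolrServer's actual implementation: Solr's compositeId router uses MurmurHash3 over hash ranges from the cluster state, not `String.hashCode()` modulo shard count, and the class and method names here are hypothetical.

```java
// Hypothetical sketch of client-side document routing: hash each doc id
// to a shard index, then group ids per shard so each batch can be sent
// directly to that shard's leader (optionally on its own thread).
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RoutingSketch {
    // Map a document id to a shard index in [0, numShards).
    static int shardFor(String docId, int numShards) {
        // Mask the sign bit rather than Math.abs (which overflows on MIN_VALUE).
        return (docId.hashCode() & 0x7fffffff) % numShards;
    }

    // Group doc ids by target shard; one batch per shard leader.
    static Map<Integer, List<String>> partition(List<String> docIds, int numShards) {
        Map<Integer, List<String>> batches = new HashMap<>();
        for (String id : docIds) {
            batches.computeIfAbsent(shardFor(id, numShards), k -> new ArrayList<>()).add(id);
        }
        return batches;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> batches = partition(List.of("0", "2", "7"), 2);
        System.out.println(batches.size() + " shard batch(es)");
    }
}
```

Because every batch goes straight to its leader, no node has to re-route documents server-side, which is where the near-linear indexing scalability claim comes from.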
[jira] [Commented] (SOLR-3852) Admin UI - Cloud Tree with HTTP-Status 500 and an ArrayIndexOutOfBoundsException when using external ZK
[ https://issues.apache.org/jira/browse/SOLR-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719193#comment-13719193 ] Shikhar Bhushan commented on SOLR-3852: --- We've run into this. It was an ArrayIndexOutOfBoundsException arising out of: https://github.com/apache/lucene-solr/blob/4ce168a/solr/core/src/java/org/apache/solr/servlet/ZookeeperInfoServlet.java#L303 We have some znodes storing binary data, but that bit above assumes that if a znode has data, it will be a UTF-8 encoded string. That block doesn't actually do anything with the decoded result, so maybe it should just be removed. > Admin UI - Cloud Tree with HTTP-Status 500 and an > ArrayIndexOutOfBoundsException when using external ZK > --- > > Key: SOLR-3852 > URL: https://issues.apache.org/jira/browse/SOLR-3852 > Project: Solr > Issue Type: Bug > Affects Versions: 4.0-BETA > Environment: Tomcat 6, external zookeeper-3.3.5 > Reporter: Vadim Kisselmann > > It works with embedded ZK. > But when we use an external ZK (3.3.5), and this ZK has other nodes (hbase, broker, etc., with child nodes in unspecified formats), we get this error in the Admin UI in the "Cloud-Tree" view: Loading of undefined failed with HTTP-Status 500. > Important(!): The cluster still works. Our external ZK sees the Solr servers (live nodes) and has the Solr config files from the initial import. All the nodes like collections, configs, overseer-elect are there. > Only the Admin UI has a problem showing the "Cloud-Tree". Cloud-Graph works!
> Catalina log files are free of error messages; I have only this stack trace
> from Firebug (Tomcat HTML error page, inline CSS omitted):
> HTTP Status 500 - type: Exception report
> description: The server encountered an internal error () that prevented it
> from fulfilling this request.
> exception: java.lang.ArrayIndexOutOfBoundsException
> note: The full stack trace of the root cause is available in the Apache
> Tomcat/6.0.28 logs.
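The defensive alternative implied by the comment above, probing whether a znode payload is valid UTF-8 before rendering it as text, could look like this. This is a hypothetical sketch, not the servlet's actual code; the class and method names are made up for illustration.

```java
// Minimal sketch: decode znode bytes with a strict UTF-8 decoder so
// binary payloads are detected cleanly instead of surfacing as an
// unrelated runtime exception (here, an ArrayIndexOutOfBoundsException).
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ZnodeDataCheck {
    // Returns the decoded string, or null if the bytes are not valid UTF-8.
    static String asUtf8OrNull(byte[] data) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data))
                .toString();
        } catch (CharacterCodingException e) {
            return null; // binary payload: caller can show a placeholder
        }
    }

    public static void main(String[] args) {
        System.out.println(asUtf8OrNull("hello".getBytes(StandardCharsets.UTF_8)));
        System.out.println(asUtf8OrNull(new byte[] {(byte) 0xC0, (byte) 0x00}));
    }
}
```

A strict decoder (`CodingErrorAction.REPORT`) fails fast on malformed input rather than silently substituting replacement characters, which makes the binary-vs-text decision explicit.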
[jira] [Commented] (SOLR-4379) solr-core has a dependency to slf4j-jdk14 and is not binding agnostic
[ https://issues.apache.org/jira/browse/SOLR-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694290#comment-13694290 ] Shikhar Bhushan commented on SOLR-4379: --- context: http://wiki.apache.org/solr/SolrLogging#Solr_4.3_and_above > solr-core has a dependency to slf4j-jdk14 and is not binding agnostic > - > > Key: SOLR-4379 > URL: https://issues.apache.org/jira/browse/SOLR-4379 > Project: Solr > Issue Type: Improvement > Affects Versions: 4.1 > Reporter: Nicolas Labrot > Priority: Minor > > solr-core can be used as a dependency in other projects that use other bindings. In these cases slf4j-jdk14 must be excluded. > In my opinion it may be better to move the slf4j-jdk14 dependency from solr-core to the war project. > solr-core would then be binding agnostic.
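The workaround the issue description refers to (excluding slf4j-jdk14 when depending on solr-core) would look roughly like this in a consumer's POM; the version shown is illustrative:

```xml
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <version>4.1.0</version>
  <exclusions>
    <!-- keep the build binding-agnostic: supply your own SLF4J binding -->
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-jdk14</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```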
[jira] [Commented] (SOLR-4379) solr-core has a dependency to slf4j-jdk14 and is not binding agnostic
[ https://issues.apache.org/jira/browse/SOLR-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694285#comment-13694285 ] Shikhar Bhushan commented on SOLR-4379: --- Looks like this is fixed in 4.3.1 at least: http://repo1.maven.org/maven2/org/apache/solr/solr-core/4.3.1/solr-core-4.3.1.pom has no mention of {{slf4j-jdk14}} > solr-core has a dependency to slf4j-jdk14 and is not binding agnostic > - > > Key: SOLR-4379 > URL: https://issues.apache.org/jira/browse/SOLR-4379 > Project: Solr > Issue Type: Improvement > Affects Versions: 4.1 > Reporter: Nicolas Labrot > Priority: Minor > > solr-core can be used as a dependency in other projects that use other bindings. In these cases slf4j-jdk14 must be excluded. > In my opinion it may be better to move the slf4j-jdk14 dependency from solr-core to the war project. > solr-core would then be binding agnostic.