[jira] [Created] (SOLR-12765) Possibly incorrect format in JMX cache stats
Bojan Smid created SOLR-12765:
---------------------------------

Summary: Possibly incorrect format in JMX cache stats
Key: SOLR-12765
URL: https://issues.apache.org/jira/browse/SOLR-12765
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 7.4
Reporter: Bojan Smid

I posted a question on ML (https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201809.mbox/%3CCAGniRXR4Ps%3D03X0uiByCn5ecUT2VY4TLV4iNcxCde3dxBnmC-w%40mail.gmail.com%3E), but didn't get feedback. Since it looks like a possible bug, I am opening a ticket.

It seems the format of cache mbeans changed with 7.4.0. And from what I see similar change wasn't made for other mbeans, which may mean it was accidental and may be a bug.

In Solr 7.3.* format was (each attribute on its own, numeric type):

mbean: solr:dom1=core,dom2=gettingstarted,dom3=shard1,dom4=replica_n1,category=CACHE,scope=searcher,name=filterCache

attributes:
  lookups java.lang.Long = 0
  hits java.lang.Long = 0
  cumulative_evictions java.lang.Long = 0
  size java.lang.Long = 0
  hitratio java.lang.Float = 0.0
  evictions java.lang.Long = 0
  cumulative_lookups java.lang.Long = 0
  cumulative_hitratio java.lang.Float = 0.0
  warmupTime java.lang.Long = 0
  inserts java.lang.Long = 0
  cumulative_inserts java.lang.Long = 0
  cumulative_hits java.lang.Long = 0

With 7.4.0 there is a single attribute "Value" (java.lang.Object):

mbean: solr:dom1=core,dom2=gettingstarted,dom3=shard1,dom4=replica_n1,category=CACHE,scope=searcher,name=filterCache

attributes:
  Value java.lang.Object = {lookups=0, evictions=0, cumulative_inserts=0, cumulative_hits=0, hits=0, cumulative_evictions=0, size=0, hitratio=0.0, cumulative_lookups=0, cumulative_hitratio=0.0, warmupTime=0, inserts=0}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
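A monitoring client that has to cope with both layouts described in SOLR-12765 could normalize them into one flat map before charting. The sketch below is only an illustration: the class name `CacheStatsNormalizer` is hypothetical, the attributes are assumed to have already been read from the MBean into a `Map<String, Object>`, and the 7.4.0 "Value" attribute is assumed to be a `java.util.Map` (the report only says it is typed as `java.lang.Object`).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CacheStatsNormalizer {

    /**
     * Flattens cache attributes into name -> number.
     * Assumption: in the 7.4.0 layout the single "Value" attribute holds a Map;
     * in the 7.3.* layout every metric is its own numeric attribute.
     */
    static Map<String, Number> normalize(Map<String, Object> attributes) {
        Object value = attributes.get("Value");
        // Use the nested map when present, otherwise the attributes themselves.
        Map<?, ?> source = (value instanceof Map) ? (Map<?, ?>) value : attributes;

        Map<String, Number> stats = new LinkedHashMap<>();
        for (Map.Entry<?, ?> entry : source.entrySet()) {
            // Keep only numeric entries; anything else is ignored.
            if (entry.getValue() instanceof Number) {
                stats.put(String.valueOf(entry.getKey()), (Number) entry.getValue());
            }
        }
        return stats;
    }
}
```

With this shape, code that charts `hits` or `size` does not need to know which Solr version produced the bean.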
[jira] [Created] (SOLR-11323) Expose cache maxSize and autowarm settings in JMX
Bojan Smid created SOLR-11323:
---------------------------------

Summary: Expose cache maxSize and autowarm settings in JMX
Key: SOLR-11323
URL: https://issues.apache.org/jira/browse/SOLR-11323
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 7.0, 7.1
Reporter: Bojan Smid

Before Solr 7.*, cache maxSize and autowarm settings were exposed in JMX along with cache metrics. There was a textual attribute "description" which could be parsed to extract maxSize and autowarm settings. This was very useful for various monitoring tools since maxSize and autowarm could then be displayed on monitoring charts (one could for example compare current size of some cache to its maxSize without digging through configs to find this setting).

Ideally maxSize and autowarm count/% would be exposed as two separate attributes, but having a single description field (which can be parsed) would also be better than nothing.
[jira] [Commented] (SOLR-10226) JMX metric avgTimePerRequest broken
[ https://issues.apache.org/jira/browse/SOLR-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897818#comment-15897818 ]

Bojan Smid commented on SOLR-10226:
-----------------------------------

I tested the patch quickly. The totalTime metric is now there, but there is one small problem: it is expressed in ns. To be backward compatible it should be in ms.

> JMX metric avgTimePerRequest broken
> -----------------------------------
>
> Key: SOLR-10226
> URL: https://issues.apache.org/jira/browse/SOLR-10226
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4.1
> Reporter: Bojan Smid
> Assignee: Andrzej Bialecki
> Attachments: SOLR-10226.patch
[jira] [Commented] (SOLR-10226) JMX metric avgTimePerRequest broken
[ https://issues.apache.org/jira/browse/SOLR-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897711#comment-15897711 ]

Bojan Smid commented on SOLR-10226:
-----------------------------------

Thanks for looking into this and patching it so quickly :).

From what I see, "totalTime" was removed in SOLR-8785. Having it back solves my problem (actually, any monitoring solution would need such a cumulative total time).

Re avgTimePerRequest - I agree with what you suggest, a decayed value makes much more sense (non-decayed would only be useful as a hack to get to totalTime).
[jira] [Commented] (SOLR-10226) JMX metric avgTimePerRequest broken
[ https://issues.apache.org/jira/browse/SOLR-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895573#comment-15895573 ]

Bojan Smid commented on SOLR-10226:
-----------------------------------

I think avgTimePerRequest in previous versions didn't have decay/sampling applied to it. I am looking at one Solr 6.3 node which was up for the past 2 months (I checked a few other setups, with different Solr versions, but saw the same behavior). Here are the stats from its standard handler:

  requests: 1791464
  totalTime: 564718.746333
  avgTimePerRequest: 0.3152275157820643

Both requests and totalTime metrics are cumulative and avgTimePerRequest shows exactly the value totalTime/requests, therefore there was no decay/sampling applied in the calculation of avgTime before 6.4.

When it comes to the previously posted sample, there was something like 30-60 sec or so between the requests (the time I needed to write down the numbers). I just did another test, fresh values (this time just 3-5 sec between the requests):

  1 85.3
  2 41.2
  3 26.1
  4 17.0
  6 11.08
  7 7.43
  8 4.98
  9 3.62
  11 3.28
  (few min pause)
  13 8.12
  14 3.33
  (few min pause)
  15 9.69
  16 4.09

Does decay/sampling explain the behavior even with these short periods between the requests (ranging from a few sec to a few min)?
[jira] [Created] (SOLR-10226) JMX metric avgTimePerRequest broken
Bojan Smid created SOLR-10226:
---------------------------------

Summary: JMX metric avgTimePerRequest broken
Key: SOLR-10226
URL: https://issues.apache.org/jira/browse/SOLR-10226
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: metrics
Affects Versions: 6.4.1
Reporter: Bojan Smid

JMX Metric avgTimePerRequest (of org.apache.solr.handler.component.SearchHandler) doesn't appear to behave correctly anymore. It was a cumulative value in pre-6.4 versions. Since the totalTime metric was removed (which was a base for monitoring calculations), avgTimePerRequest seems like a possible alternative to calculate "time spent in requests since last measurement", but it behaves strangely after 6.4.

I did a simple test on the gettingstarted collection (just unpacked the Solr 6.4.1 version and started it with "bin/solr start -e cloud -noprompt"). The query I used was:

http://localhost:8983/solr/gettingstarted/select?indent=on&q=*:*&wt=json

I ran it 30 times in a row (with approx 1 sec between executions).

At the same time I was looking (with jconsole) at bean solr/gettingstarted_shard2_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler

Here is how the metric was changing over time (first number is "requests" metric, second number is "avgTimePerRequest"):

  10 6.6033
  12 5.9557
  13 0.9015 ---> 13th req would need negative duration if this was cumulative
  15 6.7315
  16 7.4873
  17 0.8458 ---> same case with 17th request
  23 6.1076

At the same time bean solr/gettingstarted_shard1_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler also showed strange values:

  6 5.13482
  8 10.5694
  9 0.504
  10 0.344
  12 8.8121
  18 3.3531
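The "negative duration" argument in SOLR-10226 can be checked mechanically. If avgTimePerRequest were a plain cumulative average, then requests * avgTimePerRequest would equal the cumulative total time, which can never decrease. The sketch below (class and method names are hypothetical, written only for this illustration) applies that check to the shard2 samples from the report and also repeats the pre-6.4 sanity check quoted in the thread (totalTime / requests on the Solr 6.3 node).

```java
class AvgTimeCheck {

    /** Total time implied by a truly cumulative average. */
    static double impliedTotal(long requests, double avgTimePerRequest) {
        return requests * avgTimePerRequest;
    }

    /**
     * Returns the index of the first sample whose implied total is lower than
     * the previous sample's, or -1 if the series could be a cumulative average.
     */
    static int firstInconsistentSample(long[] requests, double[] avg) {
        for (int i = 1; i < requests.length; i++) {
            if (impliedTotal(requests[i], avg[i]) < impliedTotal(requests[i - 1], avg[i - 1])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Samples for the shard2 bean, copied from the issue description.
        long[] requests = {10, 12, 13, 15, 16, 17, 23};
        double[] avg = {6.6033, 5.9557, 0.9015, 6.7315, 7.4873, 0.8458, 6.1076};
        System.out.println("first inconsistent sample index: "
                + firstInconsistentSample(requests, avg));

        // Pre-6.4 node from the thread: avgTimePerRequest was totalTime / requests.
        System.out.println("totalTime / requests = " + (564718.746333 / 1791464));
    }
}
```

For the 6.4.1 samples the check trips at the sample with 13 requests: the implied total falls from 12 * 5.9557 = 71.4684 to 13 * 0.9015 = 11.7195, which a cumulative metric cannot do.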
[jira] [Commented] (SOLR-5691) Unsynchronized WeakHashMap in SolrDispatchFilter causing issues in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893200#comment-13893200 ]

Bojan Smid commented on SOLR-5691:
----------------------------------

Thanks for fixing!

> Unsynchronized WeakHashMap in SolrDispatchFilter causing issues in SolrCloud
>
> Key: SOLR-5691
> URL: https://issues.apache.org/jira/browse/SOLR-5691
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.6.1
> Reporter: Bojan Smid
> Assignee: Mark Miller
> Fix For: 5.0, 4.7
[jira] [Created] (SOLR-5691) Unsynchronized WeakHashMap in SolrDispatchFilter causing issues in SolrCloud
Bojan Smid created SOLR-5691:
--------------------------------

Summary: Unsynchronized WeakHashMap in SolrDispatchFilter causing issues in SolrCloud
Key: SOLR-5691
URL: https://issues.apache.org/jira/browse/SOLR-5691
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.6.1
Reporter: Bojan Smid

I have a large SolrCloud setup, 7 nodes, each hosting a few 1000 cores (leaders/replicas of the same shard exist on different nodes), which is maybe making it easier to notice the problem.

A node can randomly get into a state where it stops responding to PeerSync /get requests from other nodes. When that happens, a thread dump of that node shows multiple entries like this one (one entry for each blocked request from another node; they don't go away with time):

  http-bio-8080-exec-1781 daemon prio=5 tid=0x44017720 nid=0x25ae [ JVM locked by VM at safepoint, polling bits: safep ]
    java.lang.Thread.State: RUNNABLE
    at java.util.WeakHashMap.get(WeakHashMap.java:471)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

WeakHashMap's internal state can easily get corrupted when it is used in an unsynchronized way, in which case it is known to enter an infinite loop in the .get() call. It is very likely that this happens here too.

The reason why others maybe don't see this issue could be related to the huge number of cores I have in this system. The problem is usually created when some node is starting. Also, it doesn't happen with each start; it obviously depends on the exact timing of events which lead to the map's corruption.

The fix may be as simple as changing:

  protected final Map<SolrConfig, SolrRequestParsers> parsers = new WeakHashMap<SolrConfig, SolrRequestParsers>();

to:

  protected final Map<SolrConfig, SolrRequestParsers> parsers = Collections.synchronizedMap(
      new WeakHashMap<SolrConfig, SolrRequestParsers>());

but there may be performance considerations around this since it is the entrance into Solr.
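The change proposed in SOLR-5691 can be tried in isolation. The sketch below is not the SolrDispatchFilter code: `SynchronizedWeakCache` is a hypothetical stand-in with generic key and value types, and it assumes Java 8 or later for `computeIfAbsent`. It only shows the pattern of wrapping a `WeakHashMap` in `Collections.synchronizedMap` so that concurrent readers and writers never touch the map's internal table at the same time.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

class SynchronizedWeakCache<K, V> {

    // Every call on the wrapper locks a single mutex, so the backing
    // WeakHashMap is never read while another thread is resizing it.
    private final Map<K, V> cache =
            Collections.synchronizedMap(new WeakHashMap<K, V>());

    /**
     * Returns the cached value for key, creating it on first use.
     * The synchronized wrapper performs the lookup and the insert under one
     * lock, which avoids a separate unsynchronized check-then-put.
     */
    V getOrCreate(K key, Function<? super K, ? extends V> factory) {
        return cache.computeIfAbsent(key, factory);
    }
}
```

The trade-off is the one the report already hints at: all request threads serialize on the same lock, so a hot path pays for the safety with some contention.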
[jira] [Created] (SOLR-5692) StackOverflowError during SolrCloud leader election process
Bojan Smid created SOLR-5692:
--------------------------------

Summary: StackOverflowError during SolrCloud leader election process
Key: SOLR-5692
URL: https://issues.apache.org/jira/browse/SOLR-5692
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.6.1
Reporter: Bojan Smid

I have a SolrCloud cluster with 7 nodes, each with a few 1000 cores. I got this StackOverflowError a few times when starting one of the nodes (just a piece of the stack trace, the rest repeats; the leader election process obviously got stuck in an infinite repetition of steps):

  2014-02-04 15:18:01,947 [localhost-startStop-1-EventThread] ERROR org.apache.zookeeper.ClientCnxn- Error while calling watcher
  java.lang.StackOverflowError
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.PrintWriter.<init>(PrintWriter.java:116)
    at java.io.PrintWriter.<init>(PrintWriter.java:100)
    at org.apache.solr.common.SolrException.toStr(SolrException.java:138)
    at org.apache.solr.common.SolrException.log(SolrException.java:113)
    at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:377)
    at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
    at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
    at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
    at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
    at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
    at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
    at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
    at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
    at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
    at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
    at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
    at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
    at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
    at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
    at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
    at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
    at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
    at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
    at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
    at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
    at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
    at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
    at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
    at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
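The trace in SOLR-5692 shows a cycle (rejoinLeaderElection, joinElection, checkIfIamLeader, runIamLeaderProcess, runLeaderProcess, then rejoinLeaderElection again) in which every failed attempt adds frames to the same thread's stack. One generic way to keep stack depth constant is to express the retry as a loop. The sketch below illustrates that general idea only; it is not the fix applied in Solr, and `ElectionRetry`, `attempt`, and `maxAttempts` are hypothetical names.

```java
import java.util.function.BooleanSupplier;

class ElectionRetry {

    /**
     * Calls attempt until it succeeds or maxAttempts is reached.
     * Returns the 1-based number of the successful attempt, or -1 on give-up.
     * Because each attempt returns before the next one starts, the stack
     * depth stays the same no matter how many retries are needed.
     */
    static int retryUntilSuccess(BooleanSupplier attempt, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return i;
            }
        }
        return -1;
    }
}
```

A recursive version of the same retry would need one set of frames per failed attempt, which is the growth pattern visible in the repeated block of the trace.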
[jira] Commented: (SOLR-2272) Join
[ https://issues.apache.org/jira/browse/SOLR-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994762#comment-12994762 ]

Bojan Smid commented on SOLR-2272:
----------------------------------

Very nice patch Yonik. However, it doesn't apply on current trunk any more. Does anyone, by any chance, have a fresh version of this patch?

> Join
>
> Key: SOLR-2272
> URL: https://issues.apache.org/jira/browse/SOLR-2272
> Project: Solr
> Issue Type: New Feature
> Components: search
> Reporter: Yonik Seeley
> Fix For: 4.0
> Attachments: SOLR-2272.patch
>
> Limited join functionality for Solr, mapping one set of IDs matching a query to another set of IDs, based on the indexed tokens of the fields.
> Example:
> fq={!join from=parent_ptr to:parent_id}child_doc:query

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2272) Join
[ https://issues.apache.org/jira/browse/SOLR-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994984#comment-12994984 ]

Bojan Smid commented on SOLR-2272:
----------------------------------

Great, thx a lot Yonik :).

> Join
>
> Key: SOLR-2272
> URL: https://issues.apache.org/jira/browse/SOLR-2272
> Project: Solr
> Issue Type: New Feature
> Components: search
> Reporter: Yonik Seeley
> Fix For: 4.0
> Attachments: SOLR-2272.patch, SOLR-2272.patch
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606986#action_12606986 ]

Bojan Smid commented on SOLR-236:
---------------------------------

You can check the discussion about this same problem in the posts above (starting with 1st Feb 2008). It seems like a rather complex issue which could require some serious refactoring of the collapsing code.

> Field collapsing
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.3
> Reporter: Emmanuel Keller
> Assignee: Otis Gospodnetic
> Attachments: field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch
>
> This patch includes a new feature called Field collapsing.
>
> Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection.
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
>
> The implementation adds 3 new query parameters (SolrParams):
>   collapse.field to choose the field used to group results
>   collapse.type normal (default value) or adjacent
>   collapse.max to select how many continuous results are allowed before collapsing
>
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
>
> Two patches:
> - field_collapsing.patch for current development version
> - field_collapsing_1.1.0.patch for Solr-1.1.0
>
> P.S.: Feedback and misspelling correction are welcome ;-)

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
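To make the three parameters from the SOLR-236 description concrete, the sketch below assembles a query string that carries them. The parameter names and the "normal"/"adjacent" values come from the issue text; everything else is an assumption made for illustration, including the class name `CollapseQuery`, the field name "site", and the idea of sending the parameters on a plain select request. Whether a given Solr build honours them depends on having the patch applied.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class CollapseQuery {

    /** Builds "q=...&collapse.field=...&collapse.type=...&collapse.max=...". */
    static String build(String q, String field, String type, int max) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("collapse.field", field); // field used to group results
        params.put("collapse.type", type);   // "normal" (default) or "adjacent"
        params.put("collapse.max", String.valueOf(max)); // results allowed before collapsing

        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> p : params.entrySet()) {
            if (query.length() > 0) {
                query.append('&');
            }
            query.append(encode(p.getKey())).append('=').append(encode(p.getValue()));
        }
        return query.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `CollapseQuery.build("*:*", "site", "adjacent", 2)` yields `q=*%3A*&collapse.field=site&collapse.type=adjacent&collapse.max=2`.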
[jira] Updated: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bojan Smid updated SOLR-236:
---------------------------

Attachment: solr-236.patch

I updated the patch so that it can be compiled on Solr trunk. Also, since CollapseComponent essentially copied QueryComponent's prepare method (and it seems that it is supposed to be used instead of it), I made it extend QueryComponent (with a collapsing-specific process() method, and the prepare() method inherited from the super class).
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602651#action_12602651 ]

Bojan Smid commented on SOLR-572:
---------------------------------

File based spell checker would probably be used in cases when Solr index is too small or too young. So a user would compile a dictionary file (for instance, UNIX words file) and use it as a dictionary.

> Spell Checker as a Search Component
> -----------------------------------
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Affects Versions: 1.3
> Reporter: Shalin Shekhar Mangar
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 1.3
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the following features:
> * Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and process each token separately
> * Allow the user to specify minimum length for a token (optional)
>
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599660#action_12599660 ]

Bojan Smid commented on SOLR-236:
---------------------------------

I will try to bring this patch up to date. Currently I see two main problems:

1) The patch applies to trunk, but it doesn't compile. The problem occurs mainly because of changes in Search Components (for instance, some method signatures which CollapseComponent implements were changed). I have this fixed locally (more or less), but I have to test it before posting a new version of the patch.

2) It seems that CollapseComponent can't be used in a chain with QueryComponent, only instead of it. CollapseComponent basically copies QueryComponent's querying logic and adds some of its own. I guess this isn't the right way to go. CollapseComponent should contain only collapsing logic and should be chainable with other components.

Can anyone confirm if I'm right here? Of course, there might be some fundamental reason why CollapseComponent had to be implemented this way.

Does anyone else see any other issues with this component?
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599714#action_12599714 ] Bojan Smid commented on SOLR-236:
Hi Oleg. I'll look into this also. In case you have any working code, you can mail it to me, and I'll see what can be reused.
Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Otis Gospodnetic Attachments: field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598728#action_12598728 ] Bojan Smid commented on SOLR-572:
I already found the same problem, made a fix, and sent it to Shalin; he will incorporate it into the next patch when it's ready. If you specify the field type for that dictionary (and that field type can be found in the Solr schema), you'll avoid the problem for now.
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
Expose the Lucene contrib SpellChecker as a Search Component. Provide the following features:
* Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field
* Give suggestions on a per-field basis
* Given a multi-word query, give only one consistent suggestion
* Process the query with the same analyzer specified for the source field and process each token separately
* Allow the user to specify minimum length for a token (optional)
Consistency criteria for a multi-word query can consist of the following:
* Preserve the correct words in the original query as they are
* Never give duplicate words in a suggestion
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598738#action_12598738 ] Bojan Smid commented on SOLR-572:
Oleg, that field is now called fieldType, so something like <str name="fieldType">word</str> should work for you as long as you have a fieldType named word defined in your schema.xml. Let me know if this works.
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598752#action_12598752 ] Bojan Smid commented on SOLR-572:
I noticed that when searching for a suggestion for a word which exists in the dictionary, SC returns some similar word instead of returning that same word. The old SCRH had a field "exist" which returned true if the word exists in the dictionary (so the client can treat it as a correct word that doesn't need a suggestion). We can't have exactly the same functionality here (since multi-word queries should be supported), but we can make SC return a field spellingCorrect in case all words from the query exist in the dictionary. Otherwise, there is no way to know whether the spelling was correct or whether we should display a suggestion.
There is a method in Lucene's SC to check if a word exists in the index, so it's easy to check if a word is correct. However, I'm also thinking of the situation where the query doesn't contain just simple words. For instance, for toyata AND miles:[1 to 1], we want to check just toyata in the index and return the suggestion toyota AND miles:[1 to 1]. Other query types which might pose a problem are: - fuzzy query - wildcard query - prefix query ...
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
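The spellingCorrect idea from the comment above can be sketched roughly as follows. This is only an illustration, not the component's code: the class SpellingCorrectSketch and its methods are hypothetical, a plain Set<String> stands in for the dictionary lookup that the comment attributes to Lucene's spell checker, and the whitespace-based handling of operators, fielded or range clauses, and wildcard or fuzzy terms is a deliberately naive assumption rather than real query parsing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: decide whether every plain term of a query is in the dictionary. */
final class SpellingCorrectSketch {

    /** Extracts plain terms, skipping operators, fielded/range clauses, and wildcard/fuzzy terms. */
    static List<String> plainTerms(String query) {
        List<String> terms = new ArrayList<>();
        boolean inRange = false; // true while inside a range such as miles:[1 to 1]
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (inRange) {
                if (token.endsWith("]") || token.endsWith("}")) {
                    inRange = false;
                }
                continue;
            }
            if (token.contains(":")) {
                // Fielded clause: not spell-checked; remember if a range was opened but not closed.
                boolean opens = token.contains("[") || token.contains("{");
                boolean closes = token.endsWith("]") || token.endsWith("}");
                inRange = opens && !closes;
                continue;
            }
            if (token.equals("AND") || token.equals("OR") || token.equals("NOT")) {
                continue;
            }
            if (token.matches(".*[*?~].*")) {
                continue; // wildcard, prefix, or fuzzy term
            }
            terms.add(token.toLowerCase(Locale.ROOT));
        }
        return terms;
    }

    /** True if all plain terms exist in the dictionary (also true when there are none). */
    static boolean spellingCorrect(String query, Set<String> dictionary) {
        return dictionary.containsAll(plainTerms(query));
    }
}
```

With a dictionary containing only toyota, the query toyata AND miles:[1 to 1] is reported as not correct, while toyota AND miles:[1 to 1] is reported as correct, because only the plain term is checked.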
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598835#action_12598835 ] Bojan Smid commented on SOLR-572:
Sure, a quick fix can be done easily, but it probably wouldn't cover all possibilities, hence my post...
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597913#action_12597913 ] Bojan Smid commented on SOLR-572:
I would like to add support for different character encodings in file-based dictionaries (the current implementation uses the system's default settings). I'm not sure how we'll synchronize your work with my fix. Can you let me know when and how I can start my work?
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch
[jira] Updated: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bojan Smid updated SOLR-572: Attachment: SOLR-572.patch
Character encodings for file-based dictionaries are now supported through the characterEncoding property. The configuration for such a dictionary would look like this:
{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="location">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}
New code needs the latest lucene-spellchecker-2.4*.jar from Lucene trunk.
Since the SolrResourceLoader.getLines method doesn't support configurable encodings (it treats everything as UTF-8), I wasn't sure how to add that support. I could have added an overloaded method to SolrResourceLoader, but there is a TODO comment, so I decided to create a getLines() method inside the SpellCheckComponent class instead. What do you think of this?
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
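As a rough illustration of reading a file-based dictionary with a configurable encoding, the sketch below uses only the standard java.nio API. It is not the getLines() method from the patch: the class DictionaryLinesSketch, the method signature, the fallback to the platform default when no encoding is given, and the skipping of blank lines are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an encoding-aware line reader for a spelling dictionary file. */
final class DictionaryLinesSketch {

    /**
     * Reads one dictionary word per line using the given encoding name (for example "UTF-8").
     * A null encoding falls back to the platform default, mirroring the behaviour the
     * comment describes for the earlier implementation.
     */
    static List<String> getLines(Path file, String characterEncoding) throws IOException {
        Charset charset = characterEncoding == null
                ? Charset.defaultCharset()
                : Charset.forName(characterEncoding); // throws if the name is unknown or illegal
        List<String> lines = new ArrayList<>();
        // Files.newBufferedReader reports malformed input as an IOException instead of
        // silently substituting replacement characters.
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    lines.add(word); // blank lines are skipped (an assumption of this sketch)
                }
            }
        }
        return lines;
    }
}
```

A caller would pass the value of the characterEncoding property from the dictionary configuration as the second argument.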
[jira] Issue Comment Edited: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597930#action_12597930 ] bosmid edited comment on SOLR-572 at 5/19/08 5:03 AM:
Character encodings for file-based dictionaries are now supported through the characterEncoding property. The configuration for such a dictionary would look like this:
{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="location">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}
New code needs the latest lucene-spellchecker-2.4*.jar from Lucene trunk.
Since the SolrResourceLoader.getLines method doesn't support configurable encodings (it treats everything as UTF-8), I wasn't sure how to add that support. I could have added an overloaded method to SolrResourceLoader, but there is a TODO comment, so I decided to create a getLines() method inside the SpellCheckComponent class instead. What do you think of this?
was (Author: bosmid):
Character encodings for file-based dictionaries are now supported through the characterEncoding property. The configuration for such a dictionary would look like this:
{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="location">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}
New code needs the latest lucene-spellchecker-2.4*.jar from Lucene trunk.
Since the SolrResourceLoader.getLines method doesn't support configurable encodings (it treats everything as UTF-8), I wasn't sure how to add that support. I could have added an overloaded method to SolrResourceLoader, but there is a TODO comment, so I decided to create a getLines() method inside the SpellCheckComponent class instead. What do you think of this?
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
[jira] Issue Comment Edited: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597930#action_12597930 ] bosmid edited comment on SOLR-572 at 5/19/08 5:05 AM:
Character encodings for file-based dictionaries are now supported through the characterEncoding property. The configuration for such a dictionary would look like this:
{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="sourceLocation">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}
New code needs the latest lucene-spellchecker-2.4*.jar from Lucene trunk.
Since the SolrResourceLoader.getLines method doesn't support configurable encodings (it treats everything as UTF-8), I wasn't sure how to add that support. I could have added an overloaded method to SolrResourceLoader, but there is a TODO comment, so I decided to create a getLines() method inside the SpellCheckComponent class instead. What do you think of this?
was (Author: bosmid):
Character encodings for file-based dictionaries are now supported through the characterEncoding property. The configuration for such a dictionary would look like this:
{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="location">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}
New code needs the latest lucene-spellchecker-2.4*.jar from Lucene trunk.
Since the SolrResourceLoader.getLines method doesn't support configurable encodings (it treats everything as UTF-8), I wasn't sure how to add that support. I could have added an overloaded method to SolrResourceLoader, but there is a TODO comment, so I decided to create a getLines() method inside the SpellCheckComponent class instead. What do you think of this?
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597466#action_12597466 ] Bojan Smid commented on SOLR-572:
The field attribute for the file-based dictionary is basically the same field attribute as in the default dictionary (in both cases it is used to obtain the query analyzer), which is why I used the same name. My question was: is it OK for the default dictionary to use the same field both to build the dictionary from the Solr index and to obtain the query analyzer for extracting tokens?
Spell Checker as a Search Component Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch
[jira] Updated: (SOLR-553) Highlighter does not match phrase queries correctly
[ https://issues.apache.org/jira/browse/SOLR-553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bojan Smid updated SOLR-553: Attachment: Solr-553.patch
Added a unit test for this fix to the patch.
Highlighter does not match phrase queries correctly Key: SOLR-553 URL: https://issues.apache.org/jira/browse/SOLR-553 Project: Solr Issue Type: New Feature Components: highlighter Affects Versions: 1.2 Environment: all Reporter: Brian Whitman Assignee: Otis Gospodnetic Attachments: highlighttest.xml, Solr-553.patch, Solr-553.patch
http://www.nabble.com/highlighting-pt2%3A-returning-tokens-out-of-order-from-PhraseQuery-to16156718.html
Say we search for the band "I Love You But I've Chosen Darkness":
.../select?rows=100&q=%22I%20Love%20You%20But%20I\'ve%20Chosen%20Darkness%22&fq=type:html&hl=true&hl.fl=content&hl.fragsize=500&hl.snippets=5&hl.simple.pre=%3Cspan%3E&hl.simple.post=%3C/span%3E
The highlight returns a snippet that does have the name all together:
Lights (Live) : <span>I</span> <span>Love</span> <span>You</span> But <span>I've</span> <span>Chosen</span> <span>Darkness</span> :
But it also returns unrelated snippets from the same page:
Black Francis Shop <span>I</span> Think <span>I</span> <span>Love</span> <span>You</span>
A correct highlighter should not return snippets that do not match the phrase exactly. LUCENE-794 (not yet committed, but seems to be ready) fixes the problem on the Lucene end. Solr should get it too. Related: SOLR-575
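The behaviour the SOLR-553 report asks for, highlighting terms only where the whole phrase matches, can be sketched in plain Java as below. This is not the LUCENE-794 approach and not the patch attached here: the class PhraseHighlightSketch and its method are hypothetical, matching is done on whitespace-separated words with a case-insensitive comparison, and every word of a matched phrase is wrapped (so no stopword handling), all of which are simplifying assumptions.

```java
/** Hypothetical sketch: wrap words in pre/post markers only where the full phrase occurs. */
final class PhraseHighlightSketch {

    static String highlightPhrase(String text, String phrase, String pre, String post) {
        if (phrase.trim().isEmpty()) {
            return text; // nothing to highlight
        }
        String[] words = text.trim().split("\\s+");
        String[] target = phrase.trim().split("\\s+");
        boolean[] hit = new boolean[words.length];

        // Mark a word only if it is part of a complete, in-order occurrence of the phrase.
        for (int start = 0; start + target.length <= words.length; start++) {
            boolean match = true;
            for (int k = 0; k < target.length && match; k++) {
                match = words[start + k].equalsIgnoreCase(target[k]);
            }
            if (match) {
                for (int k = 0; k < target.length; k++) {
                    hit[start + k] = true;
                }
            }
        }

        // Rebuild the text with single spaces between words, wrapping the marked ones.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(hit[i] ? pre + words[i] + post : words[i]);
        }
        return out.toString();
    }
}
```

Under these assumptions, a fragment that contains only some of the phrase's words in a different context, such as the "Black Francis Shop" snippet from the report, is returned without any markers.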