[jira] [Created] (SOLR-12765) Possibly incorrect format in JMX cache stats

2018-09-12 Thread Bojan Smid (JIRA)
Bojan Smid created SOLR-12765:
-

 Summary: Possibly incorrect format in JMX cache stats
 Key: SOLR-12765
 URL: https://issues.apache.org/jira/browse/SOLR-12765
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 7.4
Reporter: Bojan Smid


I posted a question on the mailing list 
(https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201809.mbox/%3CCAGniRXR4Ps%3D03X0uiByCn5ecUT2VY4TLV4iNcxCde3dxBnmC-w%40mail.gmail.com%3E), 
but didn't get feedback. Since it looks like a possible bug, I am opening a 
ticket.

 
  It seems the format of cache mbeans changed with 7.4.0. From what I see, a 
similar change wasn't made for other mbeans, which may mean it was accidental 
and may be a bug.
 
  In Solr 7.3.* the format was (each attribute exposed separately, with a 
numeric type):
 
mbean:
solr:dom1=core,dom2=gettingstarted,dom3=shard1,dom4=replica_n1,category=CACHE,scope=searcher,name=filterCache
 
attributes:
  lookups java.lang.Long = 0
  hits java.lang.Long = 0
  cumulative_evictions java.lang.Long = 0
  size java.lang.Long = 0
  hitratio java.lang.Float = 0.0
  evictions java.lang.Long = 0
  cumulative_lookups java.lang.Long = 0
  cumulative_hitratio java.lang.Float = 0.0
  warmupTime java.lang.Long = 0
  inserts java.lang.Long = 0
  cumulative_inserts java.lang.Long = 0
  cumulative_hits java.lang.Long = 0


 
  With 7.4.0 there is a single attribute "Value" (java.lang.Object):
 
mbean:
solr:dom1=core,dom2=gettingstarted,dom3=shard1,dom4=replica_n1,category=CACHE,scope=searcher,name=filterCache
 
attributes:
  Value java.lang.Object = {lookups=0, evictions=0, cumulative_inserts=0, 
cumulative_hits=0, hits=0, cumulative_evictions=0, size=0, hitratio=0.0, 
cumulative_lookups=0, cumulative_hitratio=0.0, warmupTime=0, inserts=0}
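
For monitoring tools that need to adapt, here is a minimal sketch of reading 
both shapes over JMX. The JMX port and the version switch are illustrative 
assumptions; the object name is the one shown above, and unpacking the 7.4.0 
composite value is left to the caller:

{code:java}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CacheMBeanProbe {
    public static void main(String[] args) throws Exception {
        // Hypothetical JMX endpoint; adjust host/port to your Solr node.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:18983/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName cache = new ObjectName(
                    "solr:dom1=core,dom2=gettingstarted,dom3=shard1,dom4=replica_n1,"
                            + "category=CACHE,scope=searcher,name=filterCache");

            String version = args.length > 0 ? args[0] : "7.4";
            if (version.startsWith("7.4")) {
                // Solr 7.4.0 style: a single "Value" attribute holding a map-like
                // object that the monitoring tool has to unpack itself.
                System.out.println("Value = " + mbs.getAttribute(cache, "Value"));
            } else {
                // Solr 7.3.x style: each metric is its own numeric attribute.
                System.out.println("lookups = " + mbs.getAttribute(cache, "lookups"));
                System.out.println("hits    = " + mbs.getAttribute(cache, "hits"));
            }
        }
    }
}
{code}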
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11323) Expose cache maxSize and autowarm settings in JMX

2017-09-05 Thread Bojan Smid (JIRA)
Bojan Smid created SOLR-11323:
-

 Summary: Expose cache maxSize and autowarm settings in JMX
 Key: SOLR-11323
 URL: https://issues.apache.org/jira/browse/SOLR-11323
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 7.0, 7.1
Reporter: Bojan Smid


Before Solr 7.*, cache maxSize and autowarm settings were exposed in JMX along 
with cache metrics. There was a textual attribute "description" which could be 
parsed to extract the maxSize and autowarm settings. This was very useful for 
various monitoring tools, since maxSize and autowarm could then be displayed on 
monitoring charts (one could, for example, compare the current size of some 
cache to its maxSize without digging through configs to find this setting).

Ideally, maxSize and autowarm count/% would be exposed as two separate 
attributes, but even a single description field (which can be parsed) would 
be better than nothing.
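
For illustration, a sketch of the kind of string parsing a monitoring tool had 
to do against the pre-7.x "description" attribute. The exact description text 
is an assumption here (modeled on strings like "LRU Cache(maxSize=512, 
initialSize=512, autowarmCount=0, ...)"), which is exactly why separate 
attributes would be nicer:

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CacheDescriptionParser {
    // Assumed shape of the pre-7.x "description" attribute; real strings may differ.
    private static final Pattern MAX_SIZE = Pattern.compile("maxSize=(\\d+)");
    private static final Pattern AUTOWARM = Pattern.compile("autowarmCount=(\\d+%?)");

    public static void main(String[] args) {
        String description =
                "LRU Cache(maxSize=512, initialSize=512, autowarmCount=0, regenerator=null)";

        Matcher m = MAX_SIZE.matcher(description);
        if (m.find()) {
            System.out.println("maxSize = " + m.group(1));
        }
        m = AUTOWARM.matcher(description);
        if (m.find()) {
            // autowarmCount may be an absolute count or a percentage, e.g. "10%".
            System.out.println("autowarmCount = " + m.group(1));
        }
    }
}
{code}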



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10226) JMX metric avgTimePerRequest broken

2017-03-06 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897818#comment-15897818
 ] 

Bojan Smid commented on SOLR-10226:
---

I tested the patch quickly; the totalTime metric is now there, but there is one 
small problem - it is expressed in ns. To be backward compatible it should be 
in ms.
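
For context, a minimal sketch of the conversion a monitoring tool would 
otherwise have to do itself (assuming the metric is read as a plain long in 
nanoseconds; the value is illustrative):

{code:java}
import java.util.concurrent.TimeUnit;

public class TotalTimeUnits {
    public static void main(String[] args) {
        long totalTimeNs = 564_718_746_333L; // illustrative value in nanoseconds

        // Whole milliseconds via TimeUnit...
        long totalTimeMs = TimeUnit.NANOSECONDS.toMillis(totalTimeNs);
        // ...or keep the fractional part, matching the old ms-based metric.
        double totalTimeMsExact = totalTimeNs / 1_000_000.0;

        System.out.println(totalTimeMs + " ms (" + totalTimeMsExact + " ms exact)");
    }
}
{code}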

> JMX metric avgTimePerRequest broken
> ---
>
> Key: SOLR-10226
> URL: https://issues.apache.org/jira/browse/SOLR-10226
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public (Default Security Level. Issues are Public)
>  Components: metrics
>Affects Versions: 6.4.1
>Reporter: Bojan Smid
>Assignee: Andrzej Bialecki 
> Attachments: SOLR-10226.patch
>
>
> JMX Metric avgTimePerRequest (of 
> org.apache.solr.handler.component.SearchHandler) doesn't appear to behave 
> correctly anymore. It was a cumulative value in pre-6.4 versions. Since 
> totalTime metric was removed (which was a base for monitoring calculations), 
> avgTimePerRequest seems like possible alternative to calculate "time spent in 
> requests since last measurement", but it behaves strangely after 6.4.
> I did a simple test on gettingstarted collection (just unpacked the Solr 
> 6.4.1 version and started it with "bin/solr start -e cloud -noprompt"). The 
> query I used was:
> http://localhost:8983/solr/gettingstarted/select?indent=on&q=*:*&wt=json
> I ran it 30 times in a row (with approx. 1 sec between executions).
> At the same time I was looking (with jconsole) at bean 
> solr/gettingstarted_shard2_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler
> Here is how metric was changing over time (first number is "requests" metric, 
> second number is "avgTimePerRequest"):
> 10   6.6033
> 12   5.9557
> 13   0.9015---> 13th req would need negative duration if this was 
> cumulative
> 15   6.7315
> 16   7.4873
> 17   0.8458---> same case with 17th request
> 23   6.1076
> At the same time bean 
> solr/gettingstarted_shard1_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler
>   also showed strange values:
> 6   5.13482
> 8   10.5694
> 9   0.504
> 10  0.344
> 12  8.8121
> 18  3.3531
> CC [~ab]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10226) JMX metric avgTimePerRequest broken

2017-03-06 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897711#comment-15897711
 ] 

Bojan Smid commented on SOLR-10226:
---

Thanks for looking into this and patching it so quickly :).

From what I see, "totalTime" was removed in SOLR-8785. Having it back solves 
my problem (actually, any monitoring solution would need such a cumulative 
total time). Re avgTimePerRequest - I agree with what you suggest, a decayed 
value makes much more sense (non-decayed would only be useful as a hack to get 
to totalTime).
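
For illustration, a sketch of the delta calculation a monitoring agent would do 
with such a cumulative requests/totalTime pair (the class and the sample values 
are hypothetical):

{code:java}
public class RequestTimeDelta {
    private long lastRequests;
    private double lastTotalTimeMs;

    /**
     * Given the current cumulative counters, return the average time per request
     * over the interval since the previous sample, then remember the new values.
     */
    double sample(long requests, double totalTimeMs) {
        long deltaRequests = requests - lastRequests;
        double deltaTimeMs = totalTimeMs - lastTotalTimeMs;
        lastRequests = requests;
        lastTotalTimeMs = totalTimeMs;
        return deltaRequests > 0 ? deltaTimeMs / deltaRequests : 0.0;
    }

    public static void main(String[] args) {
        RequestTimeDelta monitor = new RequestTimeDelta();
        monitor.sample(10, 66.0);                     // first sample establishes the baseline
        System.out.println(monitor.sample(12, 78.0)); // (78 - 66) / (12 - 10) = 6.0 ms/request
    }
}
{code}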



> JMX metric avgTimePerRequest broken
> ---
>
> Key: SOLR-10226
> URL: https://issues.apache.org/jira/browse/SOLR-10226
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public (Default Security Level. Issues are Public)
>  Components: metrics
>Affects Versions: 6.4.1
>Reporter: Bojan Smid
>Assignee: Andrzej Bialecki 
> Attachments: SOLR-10226.patch
>
>
> JMX Metric avgTimePerRequest (of 
> org.apache.solr.handler.component.SearchHandler) doesn't appear to behave 
> correctly anymore. It was a cumulative value in pre-6.4 versions. Since 
> totalTime metric was removed (which was a base for monitoring calculations), 
> avgTimePerRequest seems like possible alternative to calculate "time spent in 
> requests since last measurement", but it behaves strangely after 6.4.
> I did a simple test on gettingstarted collection (just unpacked the Solr 
> 6.4.1 version and started it with "bin/solr start -e cloud -noprompt"). The 
> query I used was:
> http://localhost:8983/solr/gettingstarted/select?indent=on&q=*:*&wt=json
> I ran it 30 times in a row (with approx. 1 sec between executions).
> At the same time I was looking (with jconsole) at bean 
> solr/gettingstarted_shard2_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler
> Here is how metric was changing over time (first number is "requests" metric, 
> second number is "avgTimePerRequest"):
> 10   6.6033
> 12   5.9557
> 13   0.9015---> 13th req would need negative duration if this was 
> cumulative
> 15   6.7315
> 16   7.4873
> 17   0.8458---> same case with 17th request
> 23   6.1076
> At the same time bean 
> solr/gettingstarted_shard1_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler
>   also showed strange values:
> 6   5.13482
> 8   10.5694
> 9   0.504
> 10  0.344
> 12  8.8121
> 18  3.3531
> CC [~ab]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10226) JMX metric avgTimePerRequest broken

2017-03-03 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895573#comment-15895573
 ] 

Bojan Smid commented on SOLR-10226:
---

I think avgTimePerRequest in previous versions didn't have decay/sampling 
applied to it. I am looking at one Solr 6.3 node which was up for the past 2 
months (I checked a few other setups, with different Solr versions, but saw the 
same behavior). Here are the stats from its standard handler:
requests: 1791464
totalTime: 564718.746333
avgTimePerRequest: 0.3152275157820643

Both the requests and totalTime metrics are cumulative, and avgTimePerRequest 
is exactly totalTime/requests, therefore no decay/sampling was applied in the 
calculation of avgTime before 6.4.
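
A quick check of that claim with the numbers above:

{code:java}
public class AvgTimeCheck {
    public static void main(String[] args) {
        long requests = 1791464;
        double totalTimeMs = 564718.746333;

        // totalTime / requests reproduces the reported avgTimePerRequest,
        // i.e. no decay was involved in the pre-6.4 calculation.
        System.out.println(totalTimeMs / requests); // ~0.3152275157820643
    }
}
{code}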


When it comes to the previously posted sample, there was something like 30-60 
sec between the requests (the time I needed to write down the numbers). I just 
did another test with fresh values (this time just 3-5 sec between the 
requests):
1   85.3
2   41.2
3   26.1
4   17.0
6   11.08
7   7.43
8   4.98
9   3.62
11  3.28
(few min pause)
13  8.12
14  3.33
(few min pause)
15  9.69
16  4.09

Does decay/sampling explain the behavior even with these short periods between 
the requests (ranging from a few sec to a few min)?
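
To make the question concrete, here is a self-contained sketch of an 
exponentially decayed mean (just the general technique, not Solr's actual 
metrics implementation): because older samples are progressively down-weighted, 
the reported average can drop sharply after one fast request, which a 
cumulative average could never do.

{code:java}
public class DecayedMean {
    private final double alpha;   // weight of the newest sample, 0 < alpha <= 1
    private double mean;
    private boolean initialized;

    DecayedMean(double alpha) {
        this.alpha = alpha;
    }

    void update(double sampleMs) {
        mean = initialized ? alpha * sampleMs + (1 - alpha) * mean : sampleMs;
        initialized = true;
    }

    public static void main(String[] args) {
        DecayedMean avg = new DecayedMean(0.8);
        double[] requestTimesMs = {6.6, 5.9, 0.2};   // two slow requests, then a fast one
        for (double t : requestTimesMs) {
            avg.update(t);
            System.out.printf("avg after %.1f ms request: %.4f%n", t, avg.mean);
        }
        // The decayed average drops to ~1.37 ms after the fast request, while a
        // cumulative average (total time / total requests) would still be ~4.23 ms.
    }
}
{code}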

> JMX metric avgTimePerRequest broken
> ---
>
> Key: SOLR-10226
> URL: https://issues.apache.org/jira/browse/SOLR-10226
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public (Default Security Level. Issues are Public)
>  Components: metrics
>Affects Versions: 6.4.1
>Reporter: Bojan Smid
>
> JMX Metric avgTimePerRequest (of 
> org.apache.solr.handler.component.SearchHandler) doesn't appear to behave 
> correctly anymore. It was a cumulative value in pre-6.4 versions. Since 
> totalTime metric was removed (which was a base for monitoring calculations), 
> avgTimePerRequest seems like possible alternative to calculate "time spent in 
> requests since last measurement", but it behaves strangely after 6.4.
> I did a simple test on gettingstarted collection (just unpacked the Solr 
> 6.4.1 version and started it with "bin/solr start -e cloud -noprompt"). The 
> query I used was:
> http://localhost:8983/solr/gettingstarted/select?indent=on&q=*:*&wt=json
> I ran it 30 times in a row (with approx. 1 sec between executions).
> At the same time I was looking (with jconsole) at bean 
> solr/gettingstarted_shard2_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler
> Here is how metric was changing over time (first number is "requests" metric, 
> second number is "avgTimePerRequest"):
> 10   6.6033
> 12   5.9557
> 13   0.9015---> 13th req would need negative duration if this was 
> cumulative
> 15   6.7315
> 16   7.4873
> 17   0.8458---> same case with 17th request
> 23   6.1076
> At the same time bean 
> solr/gettingstarted_shard1_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler
>   also showed strange values:
> 6   5.13482
> 8   10.5694
> 9   0.504
> 10  0.344
> 12  8.8121
> 18  3.3531
> CC [~ab]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-10226) JMX metric avgTimePerRequest broken

2017-03-03 Thread Bojan Smid (JIRA)
Bojan Smid created SOLR-10226:
-

 Summary: JMX metric avgTimePerRequest broken
 Key: SOLR-10226
 URL: https://issues.apache.org/jira/browse/SOLR-10226
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: metrics
Affects Versions: 6.4.1
Reporter: Bojan Smid


JMX metric avgTimePerRequest (of 
org.apache.solr.handler.component.SearchHandler) doesn't appear to behave 
correctly anymore. It was a cumulative value in pre-6.4 versions. Since the 
totalTime metric was removed (which was a base for monitoring calculations), 
avgTimePerRequest seems like a possible alternative for calculating "time spent 
in requests since last measurement", but it behaves strangely after 6.4.

I did a simple test on the gettingstarted collection (just unpacked Solr 6.4.1 
and started it with "bin/solr start -e cloud -noprompt"). The query I used was:
http://localhost:8983/solr/gettingstarted/select?indent=on&q=*:*&wt=json
I ran it 30 times in a row (with approx. 1 sec between executions).

At the same time I was looking (with jconsole) at bean 
solr/gettingstarted_shard2_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler

Here is how metric was changing over time (first number is "requests" metric, 
second number is "avgTimePerRequest"):
10   6.6033
12   5.9557
13   0.9015   ---> 13th req would need negative duration if this was cumulative
15   6.7315
16   7.4873
17   0.8458   ---> same case with 17th request
23   6.1076

At the same time bean 
solr/gettingstarted_shard1_replica2:type=/select,id=org.apache.solr.handler.component.SearchHandler
  also showed strange values:
6   5.13482
8   10.5694
9   0.504
10  0.344
12  8.8121
18  3.3531



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5691) Unsynchronized WeakHashMap in SolrDispatchFilter causing issues in SolrCloud

2014-02-06 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893200#comment-13893200
 ] 

Bojan Smid commented on SOLR-5691:
--

Thanks for fixing!

 Unsynchronized WeakHashMap in SolrDispatchFilter causing issues in SolrCloud
 

 Key: SOLR-5691
 URL: https://issues.apache.org/jira/browse/SOLR-5691
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1
Reporter: Bojan Smid
Assignee: Mark Miller
 Fix For: 5.0, 4.7


 I have a large SolrCloud setup, 7 nodes, each hosting few 1000 cores 
 (leaders/replicas of same shard exist on different nodes), which is maybe 
 making it easier to notice the problem.
 Node can randomly get into a state where it stops responding to PeerSync 
 /get requests from other nodes. When that happens, threaddump of that node 
 shows multiple entries like this one (one entry for each blocked request 
 from other node; they don't go away with time):
 http-bio-8080-exec-1781 daemon prio=5 tid=0x44017720 nid=0x25ae  [ JVM 
 locked by VM at safepoint, polling bits: safep ]
java.lang.Thread.State: RUNNABLE
 at java.util.WeakHashMap.get(WeakHashMap.java:471)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
 WeakHashMap's internal state can easily get corrupted when used in 
 unsynchronized way, in which case it is known to enter infinite loop in 
 .get() call. It is very likely that this happens here too. The reason why 
 other maybe don't see this issue could be related to huge number of cores I 
 have in this system. The problem is usually created when some node is 
 starting. Also, it doesn't happen with each start, it obviously depends on 
 correct timing of events which lead to map's corruption.
 The fix may be as simple as changing:
 protected final Map<SolrConfig, SolrRequestParsers> parsers = new 
 WeakHashMap<SolrConfig, SolrRequestParsers>();
 to:
   protected final Map<SolrConfig, SolrRequestParsers> parsers = 
 Collections.synchronizedMap(
   new WeakHashMap<SolrConfig, SolrRequestParsers>());
 but there may be performance considerations around this since it is entrance 
 into Solr.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5691) Unsynchronized WeakHashMap in SolrDispatchFilter causing issues in SolrCloud

2014-02-04 Thread Bojan Smid (JIRA)
Bojan Smid created SOLR-5691:


 Summary: Unsynchronized WeakHashMap in SolrDispatchFilter causing 
issues in SolrCloud
 Key: SOLR-5691
 URL: https://issues.apache.org/jira/browse/SOLR-5691
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1
Reporter: Bojan Smid


I have a large SolrCloud setup, 7 nodes, each hosting a few 1000 cores 
(leaders/replicas of the same shard exist on different nodes), which maybe 
makes it easier to notice the problem.

A node can randomly get into a state where it stops responding to PeerSync /get 
requests from other nodes. When that happens, a thread dump of that node shows 
multiple entries like this one (one entry for each blocked request from another 
node; they don't go away with time):

http-bio-8080-exec-1781 daemon prio=5 tid=0x44017720 nid=0x25ae  [ JVM 
locked by VM at safepoint, polling bits: safep ]
   java.lang.Thread.State: RUNNABLE
at java.util.WeakHashMap.get(WeakHashMap.java:471)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

WeakHashMap's internal state can easily get corrupted when it is used in an 
unsynchronized way, in which case it is known to enter an infinite loop in the 
.get() call. It is very likely that this is happening here too. The reason 
others maybe don't see this issue could be related to the huge number of cores 
I have in this system. The problem usually appears when some node is starting. 
Also, it doesn't happen with each start; it obviously depends on the exact 
timing of events which lead to the map's corruption.

The fix may be as simple as changing:

protected final Map<SolrConfig, SolrRequestParsers> parsers = new 
WeakHashMap<SolrConfig, SolrRequestParsers>();

to:

  protected final Map<SolrConfig, SolrRequestParsers> parsers = 
Collections.synchronizedMap(
      new WeakHashMap<SolrConfig, SolrRequestParsers>());

but there may be performance considerations around this since it is on the 
entry path into Solr.
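
For illustration, a minimal sketch of how the synchronized wrapper behaves at 
the call site (class and type names here are stand-ins, not the actual 
SolrDispatchFilter code):

{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

public class ParserCache {
    // Stand-ins for SolrConfig and SolrRequestParsers in this sketch.
    static class Config {}
    static class Parsers {}

    // Wrapping the WeakHashMap makes individual get/put calls thread-safe,
    // which is enough here because the map is never iterated.
    private final Map<Config, Parsers> parsers =
            Collections.synchronizedMap(new WeakHashMap<Config, Parsers>());

    Parsers getOrCreate(Config config) {
        Parsers p = parsers.get(config);
        if (p == null) {
            // Benign race: two threads may build the value concurrently, but the
            // map's internal structure can no longer be corrupted.
            p = new Parsers();
            parsers.put(config, p);
        }
        return p;
    }
}
{code}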



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5692) StackOverflowError during SolrCloud leader election process

2014-02-04 Thread Bojan Smid (JIRA)
Bojan Smid created SOLR-5692:


 Summary: StackOverflowError during SolrCloud leader election 
process
 Key: SOLR-5692
 URL: https://issues.apache.org/jira/browse/SOLR-5692
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1
Reporter: Bojan Smid


I have a SolrCloud cluster with 7 nodes, each with a few 1000 cores. I got this 
StackOverflowError a few times when starting one of the nodes (just a piece of 
the stack trace; the rest repeats, the leader election process obviously got 
stuck in an infinite repetition of steps):

2014-02-04 15:18:01,947 
[localhost-startStop-1-EventThread] ERROR org.apache.zookeeper.ClientCnxn - 
Error while calling watcher 
java.lang.StackOverflowError
at java.security.AccessController.doPrivileged(Native Method)
at java.io.PrintWriter.<init>(PrintWriter.java:116)
at java.io.PrintWriter.<init>(PrintWriter.java:100)
at org.apache.solr.common.SolrException.toStr(SolrException.java:138)
at org.apache.solr.common.SolrException.log(SolrException.java:113)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:377)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
 at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-2272) Join

2011-02-15 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994762#comment-12994762
 ] 

Bojan Smid commented on SOLR-2272:
--

Very nice patch, Yonik. However, it doesn't apply to the current trunk anymore. 
Does anyone, by any chance, have a fresh version of this patch?

 Join
 

 Key: SOLR-2272
 URL: https://issues.apache.org/jira/browse/SOLR-2272
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Yonik Seeley
 Fix For: 4.0

 Attachments: SOLR-2272.patch


 Limited join functionality for Solr, mapping one set of IDs matching a query 
 to another set of IDs, based on the indexed tokens of the fields.
 Example:
 fq={!join from=parent_ptr to=parent_id}child_doc:query
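
For reference, a small sketch of issuing such a join filter query through 
SolrJ (using a recent SolrJ API; the collection name is illustrative and the 
field names match the example above):

{code:java}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JoinQueryExample {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/collection1").build()) {
            SolrQuery query = new SolrQuery("*:*");
            // Map documents matching child_doc:query to their parents via the join parser.
            query.addFilterQuery("{!join from=parent_ptr to=parent_id}child_doc:query");
            QueryResponse response = solr.query(query);
            System.out.println("matches: " + response.getResults().getNumFound());
        }
    }
}
{code}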

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-2272) Join

2011-02-15 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994984#comment-12994984
 ] 

Bojan Smid commented on SOLR-2272:
--

Great, thx a lot Yonik :).

 Join
 

 Key: SOLR-2272
 URL: https://issues.apache.org/jira/browse/SOLR-2272
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Yonik Seeley
 Fix For: 4.0

 Attachments: SOLR-2272.patch, SOLR-2272.patch


 Limited join functionality for Solr, mapping one set of IDs matching a query 
 to another set of IDs, based on the indexed tokens of the fields.
 Example:
 fq={!join from=parent_ptr to=parent_id}child_doc:query

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-236) Field collapsing

2008-06-21 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606986#action_12606986
 ] 

Bojan Smid commented on SOLR-236:
-

You can check the discussion about this same problem in the posts above 
(starting with 1st Feb 2008). It seems like a rather complex issue which could 
require some serious refactoring of the collapsing code.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Otis Gospodnetic
 Attachments: field-collapsing-extended-592129.patch, 
 field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch


 This patch includes a new feature called "Field collapsing".
 It is used in order to collapse a group of results with a similar value for a 
 given field to a single entry in the result set. Site collapsing is a special 
 case of this, where all results for a given web site are collapsed into one or 
 two entries in the result set, typically with an associated "more documents 
 from this site" link. See also "Duplicate detection":
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-236) Field collapsing

2008-06-07 Thread Bojan Smid (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bojan Smid updated SOLR-236:


Attachment: solr-236.patch

I updated the patch so that it can be compiled on Solr trunk. Also, since 
CollapseComponent essentially copied QueryComponent's prepare method (and it 
seems that it is supposed to be used instead of it), I made it extend 
QueryComponent (with a collapsing-specific process() method, and the prepare() 
method inherited from the super class).
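
For illustration, a rough sketch of that structure (the class name comes from 
the patch discussion; the method bodies are hypothetical, not the actual 
patch):

{code:java}
import java.io.IOException;

import org.apache.solr.handler.component.QueryComponent;
import org.apache.solr.handler.component.ResponseBuilder;

// Sketch of a CollapseComponent that reuses QueryComponent's prepare() and
// only overrides process() with collapsing-specific logic.
public class CollapseComponent extends QueryComponent {

    // prepare(ResponseBuilder) is inherited from QueryComponent unchanged.

    @Override
    public void process(ResponseBuilder rb) throws IOException {
        String collapseField = rb.req.getParams().get("collapse.field");
        if (collapseField == null) {
            // No collapsing requested: behave exactly like QueryComponent.
            super.process(rb);
            return;
        }
        // Collapsing-specific querying/grouping logic would go here.
    }

    @Override
    public String getDescription() {
        return "Field collapsing (sketch)";
    }
}
{code}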

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Otis Gospodnetic
 Attachments: field-collapsing-extended-592129.patch, 
 field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch


 This patch includes a new feature called "Field collapsing".
 It is used in order to collapse a group of results with a similar value for a 
 given field to a single entry in the result set. Site collapsing is a special 
 case of this, where all results for a given web site are collapsed into one or 
 two entries in the result set, typically with an associated "more documents 
 from this site" link. See also "Duplicate detection":
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-05 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602651#action_12602651
 ] 

Bojan Smid commented on SOLR-572:
-

A file-based spell checker would probably be used in cases where the Solr index 
is too small or too young. So a user would compile a dictionary file (for 
instance, the UNIX words file) and use it as a dictionary.

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-236) Field collapsing

2008-05-25 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599660#action_12599660
 ] 

Bojan Smid commented on SOLR-236:
-

I will try to bring this patch up to date. Currently I see two main problems:

1) The patch applies to trunk, but it doesn't compile. The problem occurs 
mainly because of changes in Search Components (for instance, some method 
signatures which CollapseComponent implements were changed). I have this fixed 
locally (more or less), but I have to test it before posting a new version of 
the patch.

2) It seems that CollapseComponent can't be used in a chain with QueryComponent, 
but only instead of it. CollapseComponent basically copies QueryComponent's 
querying logic and adds some of its own. I guess this isn't the right way to 
go. CollapseComponent should contain only collapsing logic and should be 
chainable with other components. Can anyone confirm if I'm right here? Of 
course, there might be some fundamental reason why CollapseComponent had to be 
implemented this way.

Does anyone else see any other issues with this component?

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Otis Gospodnetic
 Attachments: field-collapsing-extended-592129.patch, 
 field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch


 This patch includes a new feature called "Field collapsing".
 It is used in order to collapse a group of results with a similar value for a 
 given field to a single entry in the result set. Site collapsing is a special 
 case of this, where all results for a given web site are collapsed into one or 
 two entries in the result set, typically with an associated "more documents 
 from this site" link. See also "Duplicate detection":
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-236) Field collapsing

2008-05-25 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599714#action_12599714
 ] 

Bojan Smid commented on SOLR-236:
-

Hi Oleg. I'll look into this also. In case you have any working code, you can 
mail it to me, and I'll see what can be reused.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Otis Gospodnetic
 Attachments: field-collapsing-extended-592129.patch, 
 field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch


 This patch includes a new feature called "Field collapsing".
 It is used in order to collapse a group of results with a similar value for a 
 given field to a single entry in the result set. Site collapsing is a special 
 case of this, where all results for a given web site are collapsed into one or 
 two entries in the result set, typically with an associated "more documents 
 from this site" link. See also "Duplicate detection":
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-21 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598728#action_12598728
 ] 

Bojan Smid commented on SOLR-572:
-

I already found the same problem, made a fix and sent it to Shalin; he will 
incorporate it into the next patch when it's ready. If you specify a field type 
for that dictionary (and that field type can be found in the Solr schema), 
you'll avoid the problem for now.

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-21 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598738#action_12598738
 ] 

Bojan Smid commented on SOLR-572:
-

Oleg, that field is now called fieldType, so something like 
<str name="fieldType">word</str> should work for you as long as you have a 
fieldType with name "word" defined in your schema.xml. Let me know if this 
works.

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-21 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598752#action_12598752
 ] 

Bojan Smid commented on SOLR-572:
-

I noticed that when searching for a suggestion for a word which exists in the 
dictionary, SC returns some similar word instead of returning that same word. 
The old SCRH had a field "exist" which returned true if the word exists in the 
dictionary (so the client can treat it as a correct word that doesn't need a 
suggestion).

We can't have exactly the same functionality here (since multi-word queries 
should be supported), but we can make SC return a field "spellingCorrect" in 
case all words from the query exist in the dictionary. Otherwise, there is no 
way to know if the spelling was correct or whether we should display a 
suggestion.

There is a method in Lucene's SC to check if a word exists in the index, so 
it's easy to check if a word is correct. However, I'm also thinking of the 
situation when we don't have just simple words in the query, for instance 
"toyata AND miles:[1 to 1]": we want to check just "toyata" in the index, and 
return the suggestion "toyota AND miles:[1 to 1]". Other query types which 
might pose a problem are:
- fuzzy query
- wildcard query
- prefix query
...
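
For illustration, a sketch of that per-token check using the exist() method of 
Lucene's contrib SpellChecker (a recent Lucene API; the index path and the 
tokenization are simplified assumptions):

{code:java}
import java.nio.file.Paths;

import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.store.FSDirectory;

public class SpellingCorrectCheck {
    public static void main(String[] args) throws Exception {
        try (FSDirectory spellIndex = FSDirectory.open(Paths.get("spellchecker"));
             SpellChecker checker = new SpellChecker(spellIndex)) {

            // Naive tokenization of the user query; real code would reuse the
            // field's analyzer and skip operators, ranges, wildcards, etc.
            String[] tokens = {"toyata"};

            boolean spellingCorrect = true;
            for (String token : tokens) {
                if (!checker.exist(token)) {
                    spellingCorrect = false;
                    String[] suggestions = checker.suggestSimilar(token, 1);
                    if (suggestions.length > 0) {
                        System.out.println(token + " -> " + suggestions[0]);
                    }
                }
            }
            System.out.println("spellingCorrect = " + spellingCorrect);
        }
    }
}
{code}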

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-21 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598835#action_12598835
 ] 

Bojan Smid commented on SOLR-572:
-

Sure, a quick fix can be done easily, but it probably wouldn't cover all 
possibilities, hence my post...

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-19 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597913#action_12597913
 ] 

Bojan Smid commented on SOLR-572:
-

I would like to add support for different character encodings in file-based 
dictionaries (the current implementation takes the system's default settings). 
I'm not sure how we'll synchronize your work with my fix. Can you let me know 
when/how I can start my work?

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-572) Spell Checker as a Search Component

2008-05-19 Thread Bojan Smid (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bojan Smid updated SOLR-572:


Attachment: SOLR-572.patch

Character encodings for file-based dictionaries are now supported with the 
property characterEncoding. So, the configuration for such a dictionary would 
look like this:

{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="location">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}

New code needs the latest lucene-spellchecker-2.4*.jar from Lucene trunk.

Since the SolrResourceLoader.getLines method doesn't support configurable 
encodings (it treats everything as UTF-8), I wasn't sure how to add that 
support. I could have added an overloaded method to SolrResourceLoader, but 
there is a TODO comment there, so I decided to create a getLines() method 
inside the SpellCheckComponent class instead. What do you think of this?
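
For illustration, a sketch of what such a component-local getLines() with a 
configurable encoding could look like (method placement and names here are 
assumptions, not the actual patch):

{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class DictionaryLines {

    /**
     * Read a dictionary file line by line using the configured characterEncoding,
     * instead of assuming UTF-8 the way SolrResourceLoader.getLines does.
     */
    static List<String> getLines(String path, String characterEncoding) throws IOException {
        List<String> lines = new ArrayList<String>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                Files.newInputStream(Paths.get(path)), Charset.forName(characterEncoding)))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty() && !line.startsWith("#")) { // skip blanks and comments
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(getLines("spellings.txt", "UTF-8"));
    }
}
{code}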

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-572) Spell Checker as a Search Component

2008-05-19 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597930#action_12597930
 ] 

bosmid edited comment on SOLR-572 at 5/19/08 5:03 AM:
--

Character encodings for file-based dictionaries now supported with property 
characterEncoding. So, configuration for such dictionary would look like this:


{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="location">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}

New code needs latest lucene-spellchecker-2.4*.jar from Lucene trunk.

Since SolrResourceLoader.getLines method doesn't support configurable encodings 
(treats everything as UTF-8), I wasn't sure how to add that support. I could 
have added overloaded method to SolrResourceLoader, but there is a TODO 
comment, so I decided to create getLines() method inside SpellCheckComponent 
class instead. What do you think of this?

  was (Author: bosmid):
Character encodings for file-based dictionaries now supported with property 
characterEncoding. So, configuration for such dictionary would look like this:

{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="location">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}

New code needs latest lucene-spellchecker-2.4*.jar from Lucene trunk.

Since SolrResourceLoader.getLines method doesn't support configurable encodings 
(treats everything as UTF-8), I wasn't sure how to add that support. I could 
have added overloaded method to SolrResourceLoader, but there is a TODO 
comment, so I decided to create getLines() method inside SpellCheckComponent 
class instead. What do you think of this?
  
 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-572) Spell Checker as a Search Component

2008-05-19 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597930#action_12597930
 ] 

bosmid edited comment on SOLR-572 at 5/19/08 5:05 AM:
--

Character encodings for file-based dictionaries now supported with property 
characterEncoding. So, configuration for such dictionary would look like this:

{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="sourceLocation">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}

New code needs latest lucene-spellchecker-2.4*.jar from Lucene trunk.

Since SolrResourceLoader.getLines method doesn't support configurable encodings 
(treats everything as UTF-8), I wasn't sure how to add that support. I could 
have added overloaded method to SolrResourceLoader, but there is a TODO 
comment, so I decided to create getLines() method inside SpellCheckComponent 
class instead. What do you think of this?

  was (Author: bosmid):
Character encodings for file-based dictionaries now supported with property 
characterEncoding. So, configuration for such dictionary would look like this:


{code:xml}
<lst name="dictionary">
  <str name="name">external</str>
  <str name="type">file</str>
  <str name="location">spellings.txt</str>
  <str name="characterEncoding">UTF-8</str>
  <str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}

New code needs latest lucene-spellchecker-2.4*.jar from Lucene trunk.

Since SolrResourceLoader.getLines method doesn't support configurable encodings 
(treats everything as UTF-8), I wasn't sure how to add that support. I could 
have added overloaded method to SolrResourceLoader, but there is a TODO 
comment, so I decided to create getLines() method inside SpellCheckComponent 
class instead. What do you think of this?
  
 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-16 Thread Bojan Smid (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597466#action_12597466
 ] 

Bojan Smid commented on SOLR-572:
-

The field attribute for the file-based dictionary is basically the same field 
attribute as in the default dictionary (in both cases it is used to obtain the 
query analyzer), so that is the reason why I used the same name. My question 
was: is it OK for the default dictionary to use the same field both to build 
the dictionary from the Solr index and to obtain the query analyzer for 
extracting tokens?

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-553) Highlighter does not match phrase queries correctly

2008-05-15 Thread Bojan Smid (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bojan Smid updated SOLR-553:


Attachment: Solr-553.patch

Added unit test for this fix to the patch.

 Highlighter does not match phrase queries correctly
 ---

 Key: SOLR-553
 URL: https://issues.apache.org/jira/browse/SOLR-553
 Project: Solr
  Issue Type: New Feature
  Components: highlighter
Affects Versions: 1.2
 Environment: all
Reporter: Brian Whitman
Assignee: Otis Gospodnetic
 Attachments: highlighttest.xml, Solr-553.patch, Solr-553.patch


 http://www.nabble.com/highlighting-pt2%3A-returning-tokens-out-of-order-from-PhraseQuery-to16156718.html
 Say we search for the band "I Love You But I've Chosen Darkness"
 .../select?rows=100&q=%22I%20Love%20You%20But%20I've%20Chosen%20Darkness%22&fq=type:html&hl=true&hl.fl=content&hl.fragsize=500&hl.snippets=5&hl.simple.pre=%3Cspan%3E&hl.simple.post=%3C/span%3E
 The highlight returns a snippet that does have the name altogether:
 Lights (Live) : <span>I</span> <span>Love</span> <span>You</span> But 
 <span>I've</span> <span>Chosen</span> <span>Darkness</span> :
 But also returns unrelated snips from the same page:
 Black Francis Shop <span>I</span> Think <span>I</span> <span>Love</span> 
 <span>You</span>
 A correct highlighter should not return snippets that do not match the phrase 
 exactly.
 LUCENE-794 (not yet committed, but seems to be ready) fixes up the problem 
 from the Lucene end. Solr should get it too.
 Related: SOLR-575 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.