Re: better stemming engine than Porter?

2008-04-22 Thread Mathieu Lecarme
Porter stemmer is not only aggressive, it is ugly, too. The generated 
code is too old, not object-oriented enough, and probably too slow.
If your KStem compiles with Java 1.4, why don't you suggest it to Lucene 
core?


M.

Wagner,Harry a écrit :

Hi HH,
Here's a note I sent Solr-dev a while back:

---
I've implemented a Solr plug-in that wraps KStem for Solr use (someone
else had already written a Lucene wrapper for it).  KStem is considered
to be more appropriate for library usage since it is much less
aggressive than Porter (i.e., searches for organization do NOT match on
organ!). If there is any interest in feeding this back into Solr I would
be happy to contribute it.
---

I believe there was interest in it, but I never opened an issue for it
and I don't know if it was ever followed-up on. I'd be happy to do that
now. Can someone on the Solr-dev team point me in the right direction
for opening an issue?

Thanks... harry


-Original Message-
From: Hung Huynh [mailto:[EMAIL PROTECTED] 
Sent: Monday, April 21, 2008 11:59 AM

To: solr-user@lucene.apache.org
Subject: better stemming engine than Porter?

I recall I've read some where in one of the mailing-list archives that
some
one had developed a better stemming algo for Solr than the built-in
Porter
stemming. Does anyone have link to that stemming module? 


Thanks,

HH 





  




Re: CorruptIndexException

2008-04-22 Thread Michael McCandless
Robert Haschart [EMAIL PROTECTED] wrote:

  To answer your questions: I completely deleted the index each time
 before retesting, and the java command as shown by ps does show -Xbatch.
  The program is running on:
   uname -a
  Linux lab8.betech.virginia.edu 2.6.18-53.1.14.el5 #1 SMP Tue Feb 19
 07:18:21 EST 2008 i686 i686 i386 GNU/Linux
   more /etc/redhat-release
  Red Hat Enterprise Linux Server release 5.1 (Tikanga)

  after downgrading from the originally reported version of java:   Java(TM)
 SE Runtime Environment (build 1.6.0_05-b13)
  to this one:
   java -version
  java version 1.6.0_02
  Java(TM) SE Runtime Environment (build 1.6.0_02-b05)
  Java HotSpot(TM) Server VM (build 1.6.0_02-b05, mixed mode)

  the indexing run successfully completed processing all 112 record chunks!
 Yea!
  (with -Xbatch on the command line, I didn't try with the 1.6.0_02 java
 without -Xbatch)

OK, that's good and bad news.  Good in that this still appears to be a
JVM issue (scary, really) since downgrading to 1.6.0_02 resolves it.
Bad in that -Xbatch is not always a viable workaround.  But at least
you have a way forward...

 So at this point it looks like the problem is in my marc-8 to utf-8
 translation code.  I'll look into this possibility further.

OK.  Let me know if this seems to come back to a Lucene issue!

Mike


Re: XSLT transform before update?

2008-04-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
hi ,

There is a new patch which implements these features. I shall
update the wiki with the documentation.

I guess we do not need to be too worried about the memory consumption.
A few MB of memory should be fine (unless you are using a file that
is tens of MB in size). Consider using XPathEntityProcessor if possible;
it uses StAX and is pretty efficient.
thanks for your support

--Noble


On Mon, Apr 21, 2008 at 5:57 PM, David Smiley @MITRE.org
[EMAIL PROTECTED] wrote:

  Cool.  So you're saying that this xslt file will operate on the entire XML
  document that was fetched from the URL and just pass it on to solr?  Thanks
  for supporting this.  The XML files I have coming from my data source
  are big, but not so big as to risk an out-of-memory error.  And I've found
  xslt to perform fast for me.  I like your proposed TemplateTransformer
  too... I'm tempted to use that in place of XSLT.  Great job Paul.

  It'd be neat to have an XSLT transformer for your framework that operates on
  a single entity (that addresses the memory usage problem).  I know your
  entities are HashMap based instead of XML, however.

  ~ David




  Noble Paul നോബിള്‍ नोब्ळ् wrote:
  
   We are planning to incorporate both your requests in the next patch.
   The implementation is going to be as follows. Mention the XSL file
   location like this:
   <entity processor="XPathEntityProcessor" xslt="file:/c:/my-own.xsl">
   
   </entity>
   So the processing will be done after the XSL transformation. If your
   XSL transformation produces a valid 'add' document, not even the
   fields are necessary. Otherwise you will need to write all the fields
   and their xpaths like any other xml
  
   <entity processor="XPathEntityProcessor" xslt="file:/c:/my-own.xsl"
   useSolrAddXml="true"/>
  
   So it will assume that the schema is the same as that of the add xml and
   process it accordingly.
  
   Another feature is going to be a TemplateTransformer which takes in a
   template as follows
  
   <entity name="e" transformer="TemplateTransformer">
   <field column="field1_2" template="${e.field1} ${e.field2}"/>
   </entity>
  
   Please let us know what you think about this.
  
   And keep giving us these great use-cases so that we can make the tool
   better.
   --Noble
  
  
  
   On Mon, Apr 21, 2008 at 12:07 AM, David Smiley @MITRE.org
   [EMAIL PROTECTED] wrote:
  
Thanks Shalin.
  
The particular XSLT processor used is not relevant; it's a spec.  Just use
the standard Java APIs.  If I want a particular processor, then I can get
that to happen by using a system property and/or you could offer a
configuration input for the standard factory class implementation for a
processor of my choice.
  
~ David
  
  
  
  
Shalin Shekhar Mangar wrote:

 Hi David,
 Actually you can concatenate values, however you'll have to write a bit of
 code. You can write this in JavaScript (if you're using Java 6) or in Java.

 Basically, you need to write a Transformer to do it. Look at

   
 http://wiki.apache.org/solr/DataImportHandler#head-a6916b30b5d7605a990fb03c4ff461b3736496a9

 For example, let's say you get fields first-name and last-name in the XML.
 But in the schema.xml you have a field called name in which you need to
 concatenate the values of first-name and last-name (with a space in
 between). Create a Java class:

 import java.util.Map;

 public class ConcatenateTransformer {
     // Called once per row; assumes both source values are single Strings.
     public Object transformRow(Map<String, Object> row) {
         String firstName = (String) row.get("first-name");
         String lastName = (String) row.get("last-name");
         row.put("name", firstName + " " + lastName);
         return row;
     }
 }

 Add this class to Solr's classpath by putting its jar in solr/WEB-INF/lib

 The data-config.xml should look like this:
 <entity name="myEntity" processor="XPathEntityProcessor"
         url="http://myurl/example.xml"
         transformer="com.yourpackage.ConcatenateTransformer">
     <field column="first-name" xpath="/record/first-name" />
     <field column="last-name" xpath="/record/last-name" />
     <field column="name" />
 </entity>

 This will call the ConcatenateTransformer.transformRow method for each row,
 and you can concatenate any field with any field (or constant). Note that
 the Solr document will keep only those fields which are in the schema.xml;
 the rest are thrown away.

 If you don't want to write this in Java, you can use JavaScript by using
 the built-in ScriptTransformer. For an example, look at

   
 http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9

 However, I'm beginning to realize that XSLT is a common need. Let me see
 how best we can accommodate it in DataImportHandler. Which XSLT processor
 would you prefer?

 On Sat, Apr 19, 2008 at 12:13 AM, David Smiley @MITRE.org
 [EMAIL PROTECTED]
 wrote:

RE: better stemming engine than Porter?

2008-04-22 Thread Wagner,Harry
Thanks Ryan. I just opened SOLR-546. Please let me know if I can provide
further help. Cheers! h

-Original Message-
From: Ryan McKinley [mailto:[EMAIL PROTECTED] 
Sent: Monday, April 21, 2008 2:33 PM
To: solr-user@lucene.apache.org
Subject: Re: better stemming engine than Porter?

Hey-

to create an issue, make an account on jira and post it...
https://issues.apache.org/jira/browse/SOLR

Give that a try and holler if you have trouble.

ryan




Re: More Like This boost

2008-04-22 Thread Erik Hatcher


On Apr 21, 2008, at 5:02 PM, Francisco Sanmartin wrote:
Is it possible to boost the query that MoreLikeThis returns before  
sending it to Solr? I mean, technically it is possible, because you  
can add a factor to the whole query but...does it make sense?  
(Remember that MoreLikeThis can already boost each term inside the  
query).


For example, this could be a result of MoreLikeThis (with native  
boosting enabled)


queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29  
morelikethis^0.67)


what I want to do is

queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29  
morelikethis^0.67)^0.60  ---(notice the boost of 0.60 for the  
whole query)


That last boost wouldn't change the doc ordering at all, so it'd be  
kinda useless.


What are you trying to accomplish?

Erik
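
To illustrate Erik's point, here is a minimal sketch. It assumes a simplified model in which the outer boost multiplies every document's score by the same positive constant; real Lucene scoring has more factors. The class name, the document ids and the scores are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of Solr or Lucene.
final class UniformBoostDemo {

    /** Returns the document ids ordered by descending (score * boost). */
    static List<String> rank(Map<String, Double> scores, double boost) {
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id) * boost).reversed());
        return ids;
    }

    public static void main(String[] args) {
        // Made-up raw scores for three documents.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("doc1", 1.8);
        scores.put("doc2", 0.9);
        scores.put("doc3", 2.4);

        System.out.println(rank(scores, 1.0));   // [doc3, doc1, doc2]
        System.out.println(rank(scores, 0.60));  // [doc3, doc1, doc2]
    }
}
```

Scaling every score by the same positive factor changes the numbers but not their relative order, which is why a single boost around the whole query has no visible effect on ranking.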



Re: More Like This boost

2008-04-22 Thread Francisco Sanmartin
I know that a single query of that type does not change anything. But 
when there are two or more with different boosts, I hope it does. Here is the 
situation:
My docs have Title and Description. What I want to do is to give 
more relevancy to the morelikethis on the title than on the description. 
So the query would be like this:


query = (words^0.4 in^0.3 the^0.56 title^0.65)^0.70 (words^0.7 in^0.33 
the^0.49 description^0.43)^0.30


This way, the words in the title are more relevant than the words in the 
description, right?


Thanks!

Pako
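
The intuition in this message can be checked with a small sketch. It assumes a simplified model in which a document's score is the boosted sum of a title-clause score and a description-clause score; it says nothing about what the MoreLikeThis handler itself supports. The class name and the per-clause scores are hypothetical; only the 0.70 and 0.30 boosts come from the query above.

```java
// Hypothetical demo class, not part of Solr or Lucene.
final class FieldBoostDemo {

    /** Combined score under the simplified model: a boosted sum of two clause scores. */
    static double combined(double titleScore, double descScore,
                           double titleBoost, double descBoost) {
        return titleBoost * titleScore + descBoost * descScore;
    }

    public static void main(String[] args) {
        // Made-up clause scores: docA matches mostly in the description,
        // docB matches mostly in the title.
        double aTitle = 0.2, aDesc = 0.9;
        double bTitle = 0.8, bDesc = 0.1;

        // Equal boosts: docA ranks above docB.
        System.out.println(combined(aTitle, aDesc, 0.5, 0.5) > combined(bTitle, bDesc, 0.5, 0.5));

        // Title boosted to 0.70, description to 0.30: docB now ranks above docA.
        System.out.println(combined(aTitle, aDesc, 0.70, 0.30) < combined(bTitle, bDesc, 0.70, 0.30));
    }
}
```

Unlike a single boost on the whole query, different boosts on separate clauses change the relative contribution of each clause, so they can reorder results.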




Re: More Like This boost

2008-04-22 Thread Erik Hatcher
No, the MLT feature does not have that kind of field-specific  
boosting capability.  It sounds like it could be a useful enhancement  
though.  Of course you do get boosts for interesting terms already,  
but maybe having an additional field-specific boost would be a nice  
touch too.


Erik


Re: More Like This boost

2008-04-22 Thread Walter Underwood
It should help to weight the terms with their frequency in the
original document. That will distinguish between two documents
with the same terms, but different focus.

wunder
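
One way to read this suggestion, as a rough sketch only: count how often each term occurs in the source document and use the count relative to the most frequent term as the term's boost. The class, the naive tokenizer and the normalization are assumptions made for illustration, not how MoreLikeThis actually computes its boosts.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Solr or Lucene.
final class TermFrequencyBoosts {

    /** Maps each term to (its count / highest count) in the given text. */
    static Map<String, Double> boosts(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        int max = 0;
        // Naive tokenization on non-word characters; a real analyzer would differ.
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            int count = counts.merge(token, 1, Integer::sum);
            max = Math.max(max, count);
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            result.put(e.getKey(), (double) e.getValue() / max);
        }
        return result;
    }

    /** Renders the boosts in the term^boost syntax used earlier in this thread. */
    static String toQuery(Map<String, Double> boosts) {
        StringJoiner query = new StringJoiner(" ");
        for (Map.Entry<String, Double> e : boosts.entrySet()) {
            query.add(String.format(Locale.ROOT, "%s^%.2f", e.getKey(), e.getValue()));
        }
        return query.toString();
    }
}
```

For example, `TermFrequencyBoosts.toQuery(TermFrequencyBoosts.boosts("solr solr lucene"))` yields `solr^1.00 lucene^0.50`, so two documents with the same vocabulary but a different emphasis produce different queries.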




Enhancing the query language

2008-04-22 Thread Kamran Shadkhast

For the kind of searching we do over news content, we need a more
sophisticated query language.
Currently the Solr query language is not enough for our needs.
I understand it is possible to add our own customized query parser to the
system, but I was wondering if anybody has done that and has any ideas to
share about how and where to start.
For example, we need to have:
paragraph proximity, i.e.   (termsgroup1) near/n (termgroup2)   termsgroup1
within n paragraphs of termgroup2
finding terms a number of times, i.e.   atleast/n abcd in text   abcd
should show up at least n times

Thanks,
Kamran shadkhast
-- 
View this message in context: 
http://www.nabble.com/Enhancing-the-query-language-tp16824860p16824860.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: better stemming engine than Porter?

2008-04-22 Thread Jay

Hi Wagner,

Thanks for the intro of KStem! I quickly scanned the original paper on 
KStem by Robert Krovetz but could not find any timing comparison data on
KStem and Porter stem. I wonder how slow/fast Kstem is compared to 
Porter stem based on your use in your application?


Jay

Wagner,Harry wrote:

Mathieu,
It's not my Kstem. It was written by someone at Umass, Amherst. More info here: 
http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi 


Someone else had already ported it to Lucene. I simply modified that wrapper to 
work with Solr. I'll open an issue for it so that it can (hopefully) be 
integrated into the project.

Cheers... harry



RE: better stemming engine than Porter?

2008-04-22 Thread Wagner,Harry
Hi Jay,
I did not do a timing comparison either, but any change in performance after 
switching to Kstem was not noticeable.  Cheers... h





Re: Highlighted field gets truncated

2008-04-22 Thread Mike Klaas

On 19-Apr-08, at 3:02 AM, Christian Wittern wrote:

Mike Klaas wrote:


Fragments are generated independently from matching (I realize this  
isn't an ideal algorithm).


So it could be that the match is not part of the fragment?  This  
sounds a bit strange.  Is there a way to make sure the fragment  
contains the match other than returning the whole field and do the  
fragmenting myself?


The highlighting algorithm is as follows:
 1. fragment the whole field into N fragments
 2. score each fragment based on the keyword matches (more matches  
are better; prefer matches on different keywords to many matches on  
the same keyword).  Fragments that have no matching keywords do not  
have a positive score.
 3. return the top hl.maxSnippets fragments that score > 0

As you can see, only fragments containing a match are returned (note  
that there are very often multiple matches--you seemed to assume only  
one).
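
The three steps above can be sketched roughly as follows. This is not the Solr highlighter's code: it assumes fixed-size character fragments as a stand-in for the real fragmenter, and the class name, method names and scoring weights are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, not the Solr or Lucene highlighter.
final class FragmentHighlighter {

    /** Step 1: cut the field into fragments without looking at where matches are. */
    static List<String> fragment(String field, int size) {
        List<String> fragments = new ArrayList<>();
        for (int i = 0; i < field.length(); i += size) {
            fragments.add(field.substring(i, Math.min(field.length(), i + size)));
        }
        return fragments;
    }

    /** Step 2: score a fragment; distinct keywords weigh more than repeats (made-up weights). */
    static double score(String fragment, Set<String> keywords) {
        String lower = fragment.toLowerCase(Locale.ROOT);
        int distinct = 0;
        int total = 0;
        for (String keyword : keywords) {  // keywords are assumed to be lowercase
            if (keyword.isEmpty()) {
                continue;
            }
            int count = 0;
            for (int at = lower.indexOf(keyword); at >= 0;
                     at = lower.indexOf(keyword, at + keyword.length())) {
                count++;
            }
            if (count > 0) {
                distinct++;
            }
            total += count;
        }
        return distinct + 0.1 * total;
    }

    /** Step 3: keep only fragments scoring above zero and return the best maxSnippets. */
    static List<String> highlight(String field, Set<String> keywords, int size, int maxSnippets) {
        List<String> matching = new ArrayList<>();
        for (String f : fragment(field, size)) {
            if (score(f, keywords) > 0) {
                matching.add(f);
            }
        }
        matching.sort((a, b) -> Double.compare(score(b, keywords), score(a, keywords)));
        return new ArrayList<>(matching.subList(0, Math.min(maxSnippets, matching.size())));
    }
}
```

Because the fragments are cut first and scored afterwards, a keyword that straddles a fragment boundary is split and no longer counts as a match in either piece, although the fragment can still be returned if it contains another keyword.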


-Mike


logging through log4j

2008-04-22 Thread Henrib

Hi,
I'm (still) seeking more advice on this deployment issue which is to use
org.apache.log4j instead of java.util.logging. I'm not seeking re-starting
any discussion on solr4j/commons/log4j/jul respective benefits; I'm seeking
a way to bridge jul to log4j with the minimum specific per-container
configuration or restriction.
I've failed to find a way that would work for all servlet containers
(Tomcat, WebSphere, Jetty) without disrupting the Solr code.
My latest attempt, which requires code modification, is posted in the last
reply here:
http://www.nabble.com/logging-through-log4j-to13747253.html#a16825364
Comments/experience welcome.
Thanks
Henri
-- 
View this message in context: 
http://www.nabble.com/logging-through-log4j-tp16825424p16825424.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Highlighted field gets truncated

2008-04-22 Thread Christian Wittern

Mike Klaas wrote:

On 19-Apr-08, at 3:02 AM, Christian Wittern wrote:
So it could be that the match is not part of the fragment?  This 
sounds a bit strange.  Is there a way to make sure the fragment 
contains the match other than returning the whole field and do the 
fragmenting myself?



[...]
As you can see, only fragments containing a match are returned (note 
that there is very often multiple matches--you seemed to assume only 
one).


Mike, thank you for the clarification.  Now I understand what went wrong 
in the example I looked at.  I am querying ngram-indexed data (Chinese 
text).  A user enters two or three characters and expects them to be 
matched more or less as a substring match.  The fragment I looked at 
contained only one of the characters (the other was cut off at the end); 
this is what made me wonder.  From what you say, even adding 
quotation marks around the query will not prevent this from happening 
(in this case, it would simply obscure the match). 

Are there any plans to improve the algorithm for fragmentation?  Or are 
there other workarounds?


All the best,

Christian



Re: better stemming engine than Porter?

2008-04-22 Thread Otis Gospodnetic
I actually doubt Porter's is slow.  From what I recall, it's a bunch of simple 
if/elses.

KStem can't get added to Lucene core due to its license (search Lucene JIRA for 
an issue that covered this several years ago).

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch





Spellchecker Question

2008-04-22 Thread Matt Mitchell
I'm using the Spellchecker handler but am a little confused. The docs say to
run the cmd=rebuild when building the first time. Do I need to supply a q
param with that cmd=rebuild? The examples show a url with the q param set
while rebuilding, but the main section on the cmd param doesn't say much
about it. My hunch is that I need to supply a q?

Thanks,
Matt