RE: question about highlight field

2007-06-06 Thread Xuesong Luo
Yes, I'm using 1.1. The example in my last email is an expected result,
not the real result. Indeed I didn't see the arr element in the
highlighting element when either prefix wildcard or true wildcard query
is used.
I just tried the nightly build; as you said, it works great except for
prefix wildcards.

Thanks for your help!
Xuesong


-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 05, 2007 10:16 PM
To: solr-user@lucene.apache.org
Subject: RE: question about highlight field


: One more question about using wildcards. I found that if a wildcard is
: used in the query, the highlighting element only shows the unique id;
: it won't display

: <lst name="highlighting">
:  <lst name="id1">
:   <arr name="TITLE">
:     <str><em>Consult</em>ant</str>

your description of the problem doesn't seem to match what you've pasted
... it looks like it's highlighting just the prefix from the query.

You're using Solr 1.1 right?

Unfortunately, i think you are damned if you do, damned if you don't ...
in Solr 1.1, highlighting used the info from the raw query to do
highlighting, hence in your query for consult* it would highlight
the Consult part of Consultant even though the prefix query was matching
the whole word.  In the trunk (soon to be Solr 1.2) Mike fixed that so the
query is rewritten to its expanded form before highlighting is done ...
this works great for true wildcard queries (ie: cons*t* or cons?lt*), but
Solr has an optimization for prefix queries (ie: consult*) to reduce the
likelihood of Solr crashing if the prefix matches a lot of terms ...
unfortunately this breaks highlighting of prefix queries, and no one has
implemented a solution yet...

https://issues.apache.org/jira/browse/SOLR-195




-Hoss




custom writer, working but... a strange exception in logs

2007-06-06 Thread Frédéric Glorieux


Hi all,

First, as a Lucene user for years, I really should thank you for Solr.

For a start, I wrote a little results writer for an app. It works like 
what I understand of Solr, except for a strange exception I'm not able 
to puzzle out.


Version: fresh subversion.
 1. Class
 2. Stacktrace
 3. Maybe ?

1. Class


import java.io.IOException;
import java.io.Writer;

import org.apache.solr.request.QueryResponseWriter;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.request.SolrQueryResponse;
import org.apache.solr.util.NamedList;

public class HTMLResponseWriter implements QueryResponseWriter {
  public static String CONTENT_TYPE_HTML_UTF8 = "text/html; charset=UTF-8";
  /** A custom HTML header configured from solrconfig.xml */
  static String HEAD;
  /** A custom HTML footer configured from solrconfig.xml */
  static String FOOT;

  /** get some snippets from conf */
  public void init(NamedList n) {
    String s = (String) n.get("head");
    if (s != null && !"".equals(s)) HEAD = s;
    s = (String) n.get("foot");
    if (s != null && !"".equals(s)) FOOT = s;
  }

  public void write(Writer writer, SolrQueryRequest req, SolrQueryResponse rsp)
      throws IOException {
    // causes the exception below
    writer.write(HEAD);
    /* loop on my results, working like it should */
    // causes the exception below
    writer.write(FOOT);
  }

  public String getContentType(SolrQueryRequest request, SolrQueryResponse response) {
    return CONTENT_TYPE_HTML_UTF8;
  }
}

2. Stacktrace
=

GRAVE: org.apache.solr.core.SolrException: Missing required parameter: q
        at org.apache.solr.request.RequiredSolrParams.get(RequiredSolrParams.java:50)
        at org.apache.solr.request.StandardRequestHandler.handleRequestBody(StandardRequestHandler.java:72)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
        at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:66)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
...

3. Maybe ?
==

I can't figure out why, but when writer.write(HEAD) is executed, I see code 
from StandardRequestHandler executed twice in the debugger; the first call 
is OK, the second doesn't have the q parameter. Displaying results is 
always OK. Without those lines, there is only one call to 
StandardRequestHandler and no exception in the log, but no more head or 
foot. When the HEAD and FOOT values are hard-coded rather than configured, 
there's no exception. If HEAD and FOOT are not static, the problem is the 
same.


Is it a mistake in my code? Every piece of advice is welcome, and if I've 
hit a bug, be sure I will do my best to help.


--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique


post.jar is absent in Solr distribution

2007-06-06 Thread Manoharam Reddy

I am an absolute noob to solr and I am trying out the Solr tutorial
present at http://lucene.apache.org/solr/tutorial.html

In the tutorial, post.jar is mentioned but I don't find post.jar
anywhere. I downloaded the solr tarball from
http://www.eu.apache.org/dist/lucene/solr/1.1/apache-solr-1.1.0-incubating.tgz

What do I do now?


Re: custom writer, working but... a strange exception in logs

2007-06-06 Thread Yonik Seeley

On 6/6/07, Frédéric Glorieux [EMAIL PROTECTED] wrote:

I can't figure why, but when writer.write(HEAD) is executed, I see code
from StandardRequestHandler executed 2 times in the debugger, first is
OK, second hasn't the q parameter.


I don't know why that would be... what is the client sending the request?
If it gets an error, does it retry or something?


Displaying results is always OK.
Without such lines, there is only one call to StandardRequestHandler, no
exception in log, but no more head or foot. When HEAD and FOOT values
are hard coded and not configured, there's no exception. If HEAD and
FOOT are not static, problem is the same.


I don't see a non-null default for HEAD/FOOT... perhaps
do   if (HEAD!=null) writer.write(HEAD);
There may be an issue with how you register in solrconfig.xml

-Yonik
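If it helps, here is a guess at what the registration might look like in solrconfig.xml; the writer name, package, and header/footer values are made up for illustration, and only the "head"/"foot" arg names come from the init() code above:

```xml
<!-- hypothetical registration: writer name, package, and values are placeholders -->
<queryResponseWriter name="html" class="my.pkg.HTMLResponseWriter">
  <str name="head"><![CDATA[<html><body>]]></str>
  <str name="foot"><![CDATA[</body></html>]]></str>
</queryResponseWriter>
```

A request would then select the writer with wt=html. If the init args are missing or misnamed, init() leaves HEAD/FOOT null and writer.write(null) throws a NullPointerException, which is why the null check above is worth adding.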


Wildcards / Binary searches

2007-06-06 Thread galo

Hi,

Three questions:

1. I want to use solr for some sort of live search, querying with 
incomplete terms + wildcard and getting any similar results. Radioh* 
would return anything containing that string. The DisMax req. handler 
doesn't accept wildcards in the q param so I'm trying the simple one and 
still have problems, as all my results are coming back with score = 1 and 
I need them sorted by relevance. Is there a way of doing this? Why 
doesn't * work in dismax (nor ~, by the way)?


2. What do the phrase slop params do?

3. I'm trying to implement another index where I store a number of int 
values for each document. Everything works ok as integers but i'd like 
to have some sort of fuzzy searches based on the bit representation of 
the numbers. Essentially, this number:


1001001010100

would be compared to these two

1011001010100
1001001010111

And the first would get a bigger score than the second, as it has only 1 
flipped bit while the second has 2.


Is it possible to implement this in solr?


Cheers,
galo



RE: Wildcards / Binary searches

2007-06-06 Thread Xuesong Luo
I have a similar question about dismax, here is what Chris said:

the dismax handler uses a much more simplified query syntax than the
standard request handler.  Only +, -, and " are special characters, so
wildcards are not supported.


HTH





Re: Wildcards / Binary searches

2007-06-06 Thread J.J. Larrea
At 4:40 PM +0100 6/6/07, galo wrote:
1. I want to use solr for some sort of live search, querying with incomplete 
terms + wildcard and getting any similar results. Radioh* would return 
anything containing that string. The DisMax req. handler doesn't accept 
wildcards in the q param so i'm trying the simple one and still have problems 
as all my results are coming back with score = 1 and I need them sorted by 
relevance.. Is there a way of doing this? Why doesn't * work in dismax (nor ~ 
by the way)??

DisMax was written with the intent of supporting a simple search box in which 
one could type or paste some text, e.g. a title like

Santa Clause: Is he Real (and if so, what is real)?

and get meaningful results.  To do that it pre-processes the query string by 
removing unbalanced quotation marks and escaping characters that would 
otherwise be treated by the query parser as operators:

\ ! ( ) : ^ [ ] { } ~ * ?
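As an illustration of that pre-processing step (this is not the actual DisMax code; the class and method names are invented), escaping can be done with a simple character scan:

```java
// Hypothetical sketch: backslash-escape the query-parser operators
// listed above before handing the string to the parser.
public class QueryEscaper {
    private static final String SPECIAL = "\\!():^[]{}~*?\"";

    public static String escape(String q) {
        StringBuilder sb = new StringBuilder(q.length());
        for (int i = 0; i < q.length(); i++) {
            char c = q.charAt(i);
            // prefix each special character with a backslash
            if (SPECIAL.indexOf(c) >= 0) sb.append('\\');
            sb.append(c);
        }
        return sb.toString();
    }
}
```

For example, escape("auto*") returns the string auto\*, so the star reaches the parser as a literal character rather than a wildcard.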

I have a local version of DisMax which parameterizes the escaping so certain 
operators can be allowed through, which I'd be happy to contribute to you or 
the codebase, but I expect SimpleRH may be a better tool for your application 
than DisMaxRH, as long as you get it to score as you wish.

Both Standard and DisMax request handlers use SolrQueryParser, an extension of 
the Lucene query parser which introduces a small number of changes, one of 
which is that prefix queries e.g. Radioh* are evaluated with 
ConstantScorePrefixQuery rather than the standard PrefixQuery.

In issue SOLR-218 developers have been discussing per-field control of query 
parser options (some of it Solr's, some of it Lucene's).  When that is 
implemented there should additionally be a property useConstantScorePrefixQuery 
analogous to the unfortunately-named QueryParser useOldRangeQuery, but handled 
by SolrQueryParser (until CSPQs are implemented as an option in Lucene QP).

Until that time, well, Chris H. posted a clever and rather timely workaround on 
the solr-dev list:

one workaround people may want to consider ... is to force the use of a 
WildcardQuery in what would otherwise be interpreted as a PrefixQuery by 
putting a ? before the *

ie: auto?* instead of auto*

(yes, this does require that at least one character follow the prefix)

Perhaps that would help in your case?

- J.J.



Re: Wildcards / Binary searches

2007-06-06 Thread galo

Yeah, I thought of that solution, but this is a 20G index with each
document having around 300 of those numbers, so I was a bit worried about
the performance. I'll try anyway, thanks!

On 6/6/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 6/6/07, galo [EMAIL PROTECTED] wrote:
  3. I'm trying to implement another index where I store a number of int
  values for each document. Everything works ok as integers but i'd like
  to have some sort of fuzzy searches based on the bit representation of
  the numbers. Essentially, this number:

  1001001010100

  would be compared to these two

  1011001010100
  1001001010111

  And the first would get a bigger score than the second, as it has only 1
  flipped bit while the second has 2.

You could store the numbers as a string field with the binary representation,
then try a fuzzy search.

  myfield:1001001010100~

-Yonik
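For comparison, the measure galo actually wants is Hamming distance (the number of flipped bits), which is simple to compute outside of Solr; a standalone sketch, not Solr code:

```java
// Hamming distance between two binary-string encodings of a number:
// XOR the values, then count the set bits. Fewer differing bits would
// mean a closer (better-scoring) match.
public class Hamming {
    public static int distance(String a, String b) {
        long diff = Long.parseLong(a, 2) ^ Long.parseLong(b, 2);
        return Long.bitCount(diff);
    }
}
```

With the numbers from the example, distance("1001001010100", "1011001010100") is 1 and distance("1001001010100", "1001001010111") is 2, matching the desired ordering. Note that a fuzzy query approximates this with character edit distance instead, which is why fuzzy results would only be roughly ranked by bit difference.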






Re: custom writer, working but... a strange exception in logs

2007-06-06 Thread Frédéric Glorieux


Thanks for the answer,

I'm feeling less guilty.

 I don't see a non-null default for HEAD/FOOT... perhaps
 do   if (HEAD!=null) writer.write(HEAD);
 There may be an issue with how you register in solrconfig.xml

I get everything I want from solrconfig.xml; I was suspecting some 
classloader mystery. Following your advice from another post, I will 
write a specific request handler, so it will be easier to trace the 
problem, with a very simple first solution: stop sending exceptions (to 
avoid gigabytes of logs).


--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique


Re: Wildcards / Binary searches

2007-06-06 Thread galo

OK, further to my email below, I've been testing with q=radioh?*

Basically the problem is, searching artists even with Radiohead having a 
big boost, it's returning stuff with less boost first, like 
Radiohead+Ani Di Franco or Radiohead+Michael Stipe.

The debug output is below, but basically, for Radiohead and one of the 
others we get this:

radiohead+ani - 655391.5  * 0.046359334
radiohead - 1150991.9 * 0.025442434

So it's fairly clear where the difference is. Looking at the numbers, 
the cause seems to be in this line:

8.781371 = idf(docFreq=4096)

While Radiohead+Ani is getting

16.000769 = idf(docFreq=2)

If I can alter this I think it's sorted... what are idf and docFreq?


  <str name="id=1200360,internal_docid=159496">
30383.514 = (MATCH) sum of:
  30383.514 = (MATCH) weight(text:radiohead+ani in 159496), product of:
    0.046359334 = queryWeight(text:radiohead+ani), product of:
      16.000769 = idf(docFreq=2)
      0.0028973192 = queryNorm
    655391.5 = (MATCH) fieldWeight(text:radiohead+ani in 159496), product of:
      1.0 = tf(termFreq(text:radiohead+ani)=1)
      16.000769 = idf(docFreq=2)
      40960.0 = fieldNorm(field=text, doc=159496)
  </str>
  <str name="id=979,internal_docid=9799640">
29284.035 = (MATCH) sum of:
  29284.035 = (MATCH) weight(text:radiohead in 9799640), product of:
    0.025442434 = queryWeight(text:radiohead), product of:
      8.781371 = idf(docFreq=4096)
      0.0028973192 = queryNorm
    1150991.9 = (MATCH) fieldWeight(text:radiohead in 9799640), product of:
      1.0 = tf(termFreq(text:radiohead)=1)
      8.781371 = idf(docFreq=4096)
      131072.0 = fieldNorm(field=text, doc=9799640)
  </str>
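For context on galo's question: idf is "inverse document frequency" and docFreq is the number of documents containing the term, so rare terms get weighted up. A sketch using the formula from Lucene's DefaultSimilarity (the numDocs value below is only a guess, chosen to roughly reproduce the figures in the debug output):

```java
// idf as in Lucene's DefaultSimilarity: 1 + ln(numDocs / (docFreq + 1)).
// Small docFreq (a rare term) => large idf => heavier weight in the score.
public class IdfDemo {
    public static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // with roughly 9.8 million docs these land near the 16.0 and 8.78
        // values seen above for radiohead+ani and radiohead
        System.out.println(idf(2, 9800000));
        System.out.println(idf(4096, 9800000));
    }
}
```

Because idf appears in both queryWeight and fieldWeight, the rare token radiohead+ani gets roughly (16.0 / 8.78)^2 ≈ 3.3x the weight of radiohead, which is enough to outweigh the boost difference.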

Thanks a lot,

galo


galo wrote:
I was doing a different trick, basically searching q=radioh*+radioh~, 
and the results are slightly better than ?*, but not great. By the way, 
the case-sensitivity of wildcards matters here, of course.

I'd like to have a look at that DisMax you have if you can post it, at 
least to compare results. The way I get to do scoring, as I say, is far 
from perfect.

By the way, I'm seeing the highlighting disappears when using these 
wildcards; is that normal?


Thanks for your help,

galo



Re: post.jar is absent in Solr distribution

2007-06-06 Thread Chris Hostetter

: I am an absolute noob to solr and I am trying out the Solr tutorial
: present at http://lucene.apache.org/solr/tutorial.html

there is a big blurb on that tutorial URL attempting to point out that it
is for a nightly release (version 1.1.2007.05.29.12.05.29) and that you
should refer to the version of the tutorial that was distributed with the
release you are using.


-Hoss



RE: tomcat context fragment

2007-06-06 Thread Park, Michael
I've found the problem. 

The Context attribute path needed to be set:

<Context path="/solr" docBase="/users/mp15/solr.war" debug="0"
         crossContext="true">
   <Environment name="solr/home" type="java.lang.String"
        value="/Users/mp15/solr" override="true"/>
</Context>


-Original Message-
From: Park, Michael [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 05, 2007 5:28 PM
To: solr-user@lucene.apache.org
Subject: tomcat context fragment

Hello All,

 

I've been working with solr on Tomcat 5.5/Windows and had success
setting my solr home using the context fragment.  However, I cannot get
it to work on Tomcat 5.0.28/Unix.  I've read and re-read the Apache
Tomcat documentation and cannot find a solution.  Has anyone run into
this issue?  Is there some Tomcat setting that is preventing this from
working?

 

Thanks,

Mike



Re: custom writer, working but... a strange exception in logs

2007-06-06 Thread Chris Hostetter

I'm baffled.

Would it be possible for you to send a scaled down (but compilable)
version of your response writer that demonstrates the problem, along with
a snippet that can be added to the example solrconfig.xml to register it
and an example request URL that triggers the problem?

that way we can all try it and see if we can reproduce your results (for
all we know, it may be an artifact of your debugger)



-Hoss



Re: Wildcards / Binary searches

2007-06-06 Thread Chris Hostetter

: I have a local version of DisMax which parameterizes the escaping so
: certain operators can be allowed through, which I'd be happy to
: contribute to you or the codebase, but I expect SimpleRH may be a better

That sounds like it would be a really useful patch, if you'd be interested
in posting it to Jira.



-Hoss



RE: question about highlight field

2007-06-06 Thread Chris Hostetter

: Yes, I'm using 1.1. The example in my last email is an expected result,
: not the real result. Indeed I didn't see the arr element in the
: highlighting element when either prefix wildcard or true wildcard query

Hmmm... yes, i'm sorry i wasn't thinking clearly -- that makes sense since
in 1.1 the queries weren't being rewritten at all and so extractTerms
wouldn't work.



-Hoss



Re[2]: Multiple doc types in schema

2007-06-06 Thread Chris Hostetter

Ah, i was misunderstanding your goal of doctypes ... the use case i
was thinking of is that you have book documents and movie documents
and you frequently only query on one type or the other, but sometimes you
do a generic query on all of them using the fields they have in common.

this is clearly not the situation you are describing, since you suggest
storing them in completely separate indexes that can be blown away
independently.

there is a patch in Jira to support multiple SolrCores in a single JVM
context ... as i understand it this would achieve your goal (but i
haven't really had a chance to look at it so i can't really speak to it).

in general, running multiple Solr instances is actually quite easy and
not as bad as you make it out to be ... the overhead of running multiple
Solr webapp instances in a single JVM doesn't really take up that much
more memory or CPU ... yes the classes are all loaded twice, but that
typically pales in comparison to the amount of data involved in your index
(unless you've got hundreds of tiny indexes or something like that)

: - more difficult to maintain the index. If I want to delete
:   all docs of a doc type, I can use deletet by query but it's
:   always easier to wipe out the whole index directory if doctypes
:   are kept separate but maintained by the same solr instance.
:   I can run two separate solr instances to achieve this then this
:   takes more memory/CPU/maintaince effort.
:
: One schema file with doctypes defined, and separate index directories
: would be perfect, in my opinion :) or even separate schema files :)

-Hoss



RE: tomcat context fragment

2007-06-06 Thread Chris Hostetter
: I've found the problem.
:
: The Context attribute path needed to be set:
:
: <Context path="/solr" docBase="/users/mp15/solr.war" debug="0"

Michael, i don't really know much about tomcat, but is this because you
had a single config file for all contexts? (the examples on our wiki
suggest that tomcat knows which context path you want based on the name
of the context fragment file)

is that something that changed between tomcat 5.0X and tomcat 5.5? (one
config file vs context fragment config files)



-Hoss



RE: tomcat context fragment

2007-06-06 Thread Park, Michael
Hi Chris,

No.  I set up a separate file, same as the wiki.  

It's either a tomcat version issue or a difference between how tomcat on
my Win laptop is configured vs. the configuration on our tomcat Unix
machine. 

I intend to run multiple instances of solr in production and wanted to
use the context fragments.

I have 3 test instances of solr running now (with 3 context files) and
found that whatever you set the path attribute to becomes the name of
the deployed web app (it doesn't have to match the name of the context
file, but it's cleaner to keep the names the same).

Here is what I found on the Apache site about this:
    The context path of this web application, which is matched against
    the beginning of each request URI to select the appropriate web
    application for processing. All of the context paths within a
    particular Host must be unique. If you specify a context path of an
    empty string (""), you are defining the default web application for
    this Host, which will process all requests not assigned to other
    Contexts.

~Mike




SOLVED Re: custom writer, working but... a strange exception in logs

2007-06-06 Thread Frédéric Glorieux


 I'm baffled.

[Yonic]
 I don't know why that would be... what is the client sending the request?
 If it gets an error, does it retry or something?

Good!
It's the favicon.ico effect.
Nothing in the logs when the class is requested from curl, but with a 
browser (here Opera), the response begins with html, and the browser 
then requests favicon.ico.




--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique


Re: SOLVED Re: custom writer, working but... a strange exception in logs

2007-06-06 Thread Frédéric Glorieux

Frédéric Glorieux wrote:

  I'm baffled.

 [Yonik]
  I don't know why that would be... what is the client sending the request?
  If it gets an error, does it retry or something?

 Good !

 Nothing in logs when the class is requested from curl,

Sorry, same idea, but it's a CSS link.

--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique


Re: SOLVED Re: custom writer, working but... a strange exception in logs

2007-06-06 Thread Chris Hostetter

: It's the favicon.ico effect.
: Nothing in the logs when the class is requested from curl, but with a
: browser (here Opera), the response begins with html, and the browser
: then requests favicon.ico.

HA HA HA HA that's freaking hilarious.

One way to avoid that might be to register a NOOP request handler with the
name /favicon.ico



-Hoss



Re: Wildcards / Binary searches

2007-06-06 Thread J.J. Larrea
Hi, Hoss.

I have a number of things I'd like to post... but the generally-useful stuff is 
unfortunately a bit interwoven with the special-case stuff, and I need to get 
out of breathing-down-my-back deadline mode to find the time to separate them, 
clean up and comment, make test cases, etc.  Hopefully next week I can post at 
least a modest contribution including this.

- J.J.




RE: tomcat context fragment

2007-06-06 Thread Chris Hostetter
: Here is what I found on the Apache site about this:

...i think you are referring to...

http://tomcat.apache.org/tomcat-5.0-doc/config/context.html

...correct?  it definitely seems to be something that was changed in 5.5.
Note the added sentence in the 5.5 docs...

    The value of this field must not be set except when statically
    defining a Context in server.xml, as it will be inferred from the
    filenames used for either the .xml context file or the docBase.

I've updated the wiki with a small note about this...

http://wiki.apache.org/solr/SolrTomcat


-Hoss



Where to put my plugins?

2007-06-06 Thread Teruhiko Kurosaka
I made a plugin that has a Tokenizer, its Factory, a
Filter and its Factory.  I modified example/solr/conf/schema.xml
to use these Factories.

Following
http://wiki.apache.org/solr/SolrPlugins

I placed the plugin jar in the top level lib and ran
the start.jar.  I got:
org.mortbay.util.MultiException[org.apache.solr.core.SolrException:
Error loading class 'com.basistech.rlp.solr.RLPTokenizerFactory']

Clearly, Jetty cannot locate my plugin.

I put the jar in example/lib and got the same error.

After taking a look at the Jetty document:
http://docs.codehaus.org/display/JETTY/Classloading
but not fully understanding it, I put the plugin jar
in example/ext.  Then I got:

java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.jav
a:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessor
Impl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.mortbay.start.Main.invokeMain(Main.java:151)
at org.mortbay.start.Main.start(Main.java:476)
at org.mortbay.start.Main.main(Main.java:94)
Caused by: java.lang.NoClassDefFoundError:
org/apache/solr/analysis/BaseTokenizerFactory


Better, Jetty can find my plugin, but it cannot load
one of the Solr classes for it?

I also tried the old way described in the first doc,
expanding the war file, 
but the result was same as above. (Can't find
org/apache/solr/analysis/BaseTokenizerFactory)


Where am I supposed to put my Tokenizer/Filter plugin?

-kuro


RE: Where to put my plugins?

2007-06-06 Thread Teruhiko Kurosaka
This is about Solr 1.1.0 running on Win XP w/JDK 1.5.
Thank you.



Re: Where to put my plugins?

2007-06-06 Thread Ryan McKinley

If the example is in:
C:\workspace\solr\example

Try putting you custom .jar in:
C:\workspace\solr\example\solr\lib

Check the README in solr home:
C:\workspace\solr\example\solr\README.txt

 This directory is optional.  If it exists, Solr will load any Jars
 found in this directory and use them to resolve any plugins
 specified in your solrconfig.xml or schema.xml (ie: Analyzers,
 Request Handlers, etc...)



Teruhiko Kurosaka wrote:

I made a plugin that has a Tokenizer, its Factory, a
Filter and its Factory.  I modified example/solr/conf/schema.xml
to use these Factories.

Following
http://wiki.apache.org/solr/SolrPlugins

I placed the plugin jar in the top level lib and ran
the start.jar.  I got:
org.mortbay.util.MultiException[org.apache.solr.core.SolrException:
Error loading class 'com.basistech.rlp.solr.RLPTokenizerFactory']

Clearly, Jetty cannot locate my plugin.

I put the jar in example/lib and got the same error.

After taking look at jetty document:
http://docs.codehaus.org/display/JETTY/Classloading
but not fully understanding it, I put the plugin jar
in example/ext.  Then I got:

java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.jav
a:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessor
Impl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.mortbay.start.Main.invokeMain(Main.java:151)
at org.mortbay.start.Main.start(Main.java:476)
at org.mortbay.start.Main.main(Main.java:94)
Caused by: java.lang.NoClassDefFoundError:
org/apache/solr/analysis/BaseTokenizerFactory


Better, Jetty can find my plugin, but it cannot load
one of the the Solr classes for it?

I also tried the old way described in the first doc, expanding the war
file, but the result was the same as above. (Can't find
org/apache/solr/analysis/BaseTokenizerFactory)


Where am I supposed to put my Tokenizer/Filter plugin?

-kuro





RE: Where to put my plugins?

2007-06-06 Thread Teruhiko Kurosaka
Ryan,
Thank you.

But creating lib under example/solr and placing my plugin jar
there yielded the same error of not being able to locate
org/apache/solr/analysis/BaseTokenizerFactory

How can this be?
-kuro



RE: Where to put my plugins?

2007-06-06 Thread Teruhiko Kurosaka
Never mind.  My mistake.  I still had a copy of the jar in ext dir.
After cleaning it up, it's now loading my plugin.

THANK YOU VERY MUCH!

 -Original Message-
 From: Teruhiko Kurosaka [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, June 06, 2007 5:58 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Where to put my plugins?
 
 Ryan,
 Thank you.
 
 But creating lib under example/solr and placing my plugin jar
 there yielded the same error of not being able to locate
 org/apache/solr/analysis/BaseTokenizerFactory
 
 How can this be?
 -kuro
 
 


Re: SOLVED Re: custom writer, working but... a strange exception in logs

2007-06-06 Thread Erik Hatcher


On Jun 6, 2007, at 5:32 PM, Chris Hostetter wrote:



: It's the favicon.ico effect.
: Nothing in logs when the class is requested from curl, but with a
: browser (here Opera), it begins a response with html, and it
: requests for favicon.ico.

HA HA HA HA that's freaking hilarious.

One way to avoid that might be to register a NOOP request handler
with the name /favicon.ico

:)  maybe we should build one in that redirects to a solr.ico or
something.
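
A minimal sketch of what such a registration might look like in
solrconfig.xml. This is hypothetical: the thread does not spell out a
recipe, the handler class is just a stand-in, and whether a name like
/favicon.ico is dispatched by path depends on your Solr version:

```xml
<!-- Hypothetical no-op mapping for browser favicon probes, so they
     stop producing noise in the logs. The class used here is only
     illustrative; a dedicated handler could return an empty response
     or redirect to a real solr.ico. -->
<requestHandler name="/favicon.ico" class="solr.StandardRequestHandler" />
```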





Wildcard not working as expected?

2007-06-06 Thread Nigel McNie
Hi,

I'm having trouble using a * wildcard after a term in a search. It does not
seem to match zero or more characters, but rather one or more (the match
can't be empty). This is using the standard query handler, by the way.

Examples:

Search for theatr* => returns 112 results, for things named 'theatre'
Search for theatre* => returns 0 results

Anyone know why this would be?

-- 
Regards,
Nigel McNie
Catalyst IT Ltd.
DDI: +64 4 803 2203




solr+hadoop = next solr

2007-06-06 Thread James liu

Anyone agree?

What is the plan for the next Solr's development? Anyone know?


--
regards
jl


Re: solr+hadoop = next solr

2007-06-06 Thread Yonik Seeley

On 6/6/07, James liu [EMAIL PROTECTED] wrote:

anyone agree?


No ;-)

At least not if you mean using map-reduce for queries.

When I started looking at distributed search, I immediately went and
read the map-reduce paper (easier concept than it first appeared), and
realized it's really more for the indexing side of things (big batch
jobs, making data from data, etc).  Nutch uses map reduce for
crawling/indexing, but not for querying.

-Yonik
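
As a toy illustration of why map-reduce fits the indexing side (a generic
sketch, not Nutch or Solr code; all names and data here are made up): the
map phase emits (term, doc id) pairs and the reduce phase groups them into
postings lists, which is exactly the batch "data from data" shape Yonik
describes.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Map: emit a (term, doc_id) pair for every token in the document."""
    for term in text.lower().split():
        yield term, doc_id

def reduce_phase(pairs):
    """Reduce: group pairs by term into sorted postings lists."""
    postings = defaultdict(set)
    for term, doc_id in pairs:
        postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

docs = {1: "Solr search", 2: "distributed Solr indexing"}
pairs = [p for doc_id, text in docs.items() for p in map_phase(doc_id, text)]
index = reduce_phase(pairs)
print(index["solr"])  # -> [1, 2]: the postings list for "solr"
```

A query, by contrast, needs a low-latency lookup into the finished postings,
which is why map-reduce is a poor fit for the search side.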


Re: Wildcard not working as expected?

2007-06-06 Thread Yonik Seeley

On 6/6/07, Nigel McNie [EMAIL PROTECTED] wrote:

I'm having trouble using a * wildcard after a term in a search. It does not
seem to match zero or more characters, but rather one or more (the match
can't be empty). This is using the standard query handler, by the way.

Examples:

Search for theatr* => returns 112 results, for things named 'theatre'
Search for theatre* => returns 0 results

Anyone know why this would be?


My guess would be stemming.
The indexed form of "theatre" is probably "theatr" after it goes through
the Porter stemmer, so no indexed term actually starts with "theatre".

Perhaps you could index another variant of the field (via copyField)
that just splits on whitespace and lowercases.

-Yonik
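
That suggestion might look roughly like this in schema.xml. A sketch under
assumptions: the type and field names (text_ws_lc, title, title_exact) are
illustrative, not from the thread:

```xml
<!-- Illustrative field type: whitespace tokenization plus lowercasing,
     no stemming, so wildcard queries match the literal indexed terms. -->
<fieldType name="text_ws_lc" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- "title" stands in for the stemmed field; copy it to an unstemmed
     variant and point wildcard queries at title_exact instead. -->
<field name="title_exact" type="text_ws_lc" indexed="true" stored="false"/>
<copyField source="title" dest="title_exact"/>
```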


Re: solr+hadoop = next solr

2007-06-06 Thread Jeff Rodenburg

I've been exploring distributed search, as of late.  I don't know about the
"next solr", but I could certainly see a "distributed solr" grow out of such
an expansion.

In terms of the FederatedSearch wiki entry (updated last year), has there
been any progress made this year on this topic, at least something worthy of
being added or updated to the wiki page?  Not to splinter efforts here, but
maybe a working group that was focused on that topic could help to move
things forward a bit.

- j

On 6/6/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 6/6/07, James liu [EMAIL PROTECTED] wrote:
 anyone agree?

No ;-)

At least not if you mean using map-reduce for queries.

When I started looking at distributed search, I immediately went and
read the map-reduce paper (easier concept than it first appeared), and
realized it's really more for the indexing side of things (big batch
jobs, making data from data, etc).  Nutch uses map reduce for
crawling/indexing, but not for querying.

-Yonik



Re: solr+hadoop = next solr

2007-06-06 Thread James liu

2007/6/7, Yonik Seeley [EMAIL PROTECTED]:


On 6/6/07, James liu [EMAIL PROTECTED] wrote:
 anyone agree?

No ;-)

At least not if you mean using map-reduce for queries.

When I started looking at distributed search, I immediately went and
read the map-reduce paper (easier concept than it first appeared), and
realized it's really more for the indexing side of things (big batch
jobs, making data from data, etc).  Nutch uses map reduce for
crawling/indexing, but not for querying.



Yes, Nutch uses map-reduce only for crawling/indexing, not for querying.


http://www.nabble.com/something-i-think-important-and-should-be-added-tf3813838.html#a10796136

Map-reduce would be just for indexing: it would decrease each master Solr
query instance's index size and increase query speed.

It will cost more time to index and merge, but it will increase query
accuracy.

The index and the data are not on the same box, so we only need to make
sure the master query server's hardware is powerful; the slave query
servers' hardware is not very important.

The master index server should support multiple indexes.

If Solr supported this, I think users of Solr would be able to set up
their search quickly.


It's just my thought.

What do you think, Yonik? And what do you think the next Solr will be?








--
regards
jl


Re: solr+hadoop = next solr

2007-06-06 Thread Yonik Seeley

On 6/6/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:

In terms of the FederatedSearch wiki entry (updated last year), has there
been any progress made this year on this topic, at least something worthy of
being added or updated to the wiki page?


Priorities shifted, and I dropped it for a while.
I recently started working with a CNET group that may need it, so I
could start working on it again in the next few months.  Don't wait
for me if you have ideas though... I'll try to follow along and chime
in.

-Yonik