Reindex Solr Using Tomcat

2010-11-18 Thread Eric Martin
Hi,

 

I searched Google and the wiki to find out how I can force a full re-index
of all of my content and I came up with zilch. My goal is to be able to
adjust the weight settings, re-index my entire database, and then search my
site and view the results of my weight adjustments.

 

I am using Tomcat 5.x and Solr 1.4.1. Weird how I couldn't find this info. I
must have missed it. Anyone know where to find it?

 

Eric

 



RE: Reindex Solr Using Tomcat

2010-11-18 Thread Eric Martin
Ah, I am using an ApacheSolr module in Drupal and used nutch to insert the data 
into the Solr index. When I was using Jetty I could just delete the data directory 
contents over SSH and then restart the service, forcing the reindex.

Currently, the ApacheSolr module for Drupal allows for a 200-record re-index 
every cron run, but that is too slow for me. During implementation and testing I 
would prefer to re-index the entire database, as I have over 400k records. 

I appreciate your help. My mind was searching for a CLI command that 
would just tell Solr to reindex the entire database and be done with it.

-Original Message-
From: Ken Stanley [mailto:doh...@gmail.com] 
Sent: Thursday, November 18, 2010 12:37 PM
To: solr-user@lucene.apache.org
Subject: Re: Reindex Solr Using Tomcat

On Thu, Nov 18, 2010 at 3:33 PM, Eric Martin e...@makethembite.com wrote:
 Hi,



 I searched google and the wiki to find out how I can force a full re-index
 of all of my content and I came up with zilch. My goal is to be able to
 adjust the weight settings, re-index  my entire database and then search my
 site and view the results of my weight adjustments.



 I am using Tomcat 5.x and Solr 1.4.1. Weird how I couldn't find this info. I
 must have missed it. Anyone know where to find it?



 Eric


Eric,

How you re-index Solr determines which method you wish to use. You can
either use the UpdateHandler using a POST of an XML file [1], or you
can use the DataImportHandler (DIH) [2]. There exist other means, but
these two should be sufficient to get started. How did you import your
initial index in the first place?

[1] http://wiki.apache.org/solr/UpdateXmlMessages
[2] http://wiki.apache.org/solr/DataImportHandler
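
For the UpdateHandler route, a full re-index typically means deleting
everything, re-posting your documents, and committing. A minimal sketch,
assuming the stock example URL and port; adjust host and port for a Tomcat
install:

# wipe the whole index, then commit the deletion
curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' \
     --data-binary '<delete><query>*:*</query></delete>'
curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' \
     --data-binary '<commit/>'
# then re-post your documents (e.g. re-run the nutch solrindex step) and commit again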



RE: Spell Checker

2010-11-17 Thread Eric Martin
Like a charm Dan, like a charm. I'm going to write this up and post it on
Drupal. Thanks a ton! I have a much better idea of Solr and its Did You Mean /
spell checker features now.

-Original Message-
From: Dan Lynn [mailto:d...@danlynn.com] 
Sent: Tuesday, November 16, 2010 5:21 PM
To: solr-user@lucene.apache.org
Subject: Re: Spell Checker

See interjected responses below

On 11/16/2010 06:14 PM, Eric Martin wrote:
 Thanks Dan! Few questions:

 Use a <copyField> to divert your main text fields to the spell field and
 then configure your spell checker to use the spell field to derive the
 spelling index.
Right. A copyField just copies data from one field to another during the 
indexing process. You can copy one field to n other fields without 
affecting the original.
 This will still keep my current copyfield for the same data, right?

 I don't need to rebuild, just reindex.

  After this, you'll need to query a spellcheck-enabled handler with
 spellcheck.build=true or enable spellchecker index builds during
optimize.
If you are using the default solrconfig.xml, a requesthandler should 
already be set up for you (but you shouldn't need a dedicated one for 
production: you can just embed the spell checker component in your 
default handler). Just query the example like this:

http://localhost:8983/solr/spell?q=ANYTHINGHERE&spellcheck=true&spellcheck.collate=true&spellcheck.build=true

Note the spellcheck.build=true parameter.
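
For reference, the solrconfig.xml wiring looks roughly like this. This is a
sketch based on the stock 1.4 example, reusing the spell field and textSpell
type from the schema snippet quoted below; buildOnOptimize is the "build
during optimize" option mentioned earlier:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <!-- rebuild the spelling index whenever the main index is optimized -->
    <str name="buildOnOptimize">true</str>
  </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <!-- run the spellcheck component after the normal query components -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>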

Cheers,
Dan
http://twitter.com/danklynn


 Totally lost on that.

 I will buy a book here shortly.

 -Original Message-
 From: Dan Lynn [mailto:d...@danlynn.com]
 Sent: Tuesday, November 16, 2010 5:01 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Spell Checker

 I had to deal with spellchecking today a bit. Make sure you are
 performing the analysis step at index-time as such:

 schema.xml:

  <fieldType name="textSpell" class="solr.TextField"
    positionIncrementGap="100" omitNorms="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StandardFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StandardFilterFactory"/>
    </analyzer>
  </fieldType>

 <fields>
   ...
   <field name="spell" type="textSpell" indexed="true" stored="false"
     multiValued="true"/>
 </fields>

From http://wiki.apache.org/solr/SpellCheckingAnalysis:

  Use a <copyField> to divert your main text fields to the spell field and
 then configure your spell checker to use the spell field to derive the
 spelling index.


 After this, you'll need to query a spellcheck-enabled handler with
 spellcheck.build=true or enable spellchecker index builds during optimize.

 Hope this helps,

 Dan Lynn
 http://twitter.com/danklynn


 On 11/16/2010 05:45 PM, Eric Martin wrote:
 Hi (again)



 I am looking at the spell checker options:





http://wiki.apache.org/solr/SpellCheckerRequestHandler#Term_Source_Configuration



 http://wiki.apache.org/solr/SpellCheckComponent#Use_in_the_Solr_Example



 I am looking in my solrconfig.xml and I see one is already in use. I am kind
 of confused by this because the recommended spell checker is not default in
 my Solr 1.4.1. I have read the documentation but am still fuzzy on what I
 should do.



 My site uses legal terms and as you can see, some terms don't jive with the
 default spell checker so I was hoping to map the spell checker to the body
 for referencing dictionary words. I am unclear what approach I should take
 and how to start the quest.



 Can someone clarify what I should be doing here? Am I on the right track?



 Eric







Spell Checker

2010-11-16 Thread Eric Martin
Hi (again)

 

I am looking at the spell checker options:

 

http://wiki.apache.org/solr/SpellCheckerRequestHandler#Term_Source_Configuration

 

http://wiki.apache.org/solr/SpellCheckComponent#Use_in_the_Solr_Example

 

I am looking in my solrconfig.xml and I see one is already in use. I am kind
of confused by this because the recommended spell checker is not default in
my Solr 1.4.1. I have read the documentation but am still fuzzy on what I
should do.

 

My site uses legal terms and as you can see, some terms don't jive with the
default spell checker so I was hoping to map the spell checker to the body
for referencing dictionary words. I am unclear what approach I should take
and how to start the quest.

 

Can someone clarify what I should be doing here? Am I on the right track?

 

Eric



RE: Spell Checker

2010-11-16 Thread Eric Martin
Thanks Dan! Few questions:

Use a <copyField> to divert your main text fields to the spell field and
then configure your spell checker to use the spell field to derive the
spelling index.
This will still keep my current copyfield for the same data, right?

I don't need to rebuild, just reindex.

 After this, you'll need to query a spellcheck-enabled handler with
spellcheck.build=true or enable spellchecker index builds during optimize.

Totally lost on that. 

I will buy a book here shortly.

-Original Message-
From: Dan Lynn [mailto:d...@danlynn.com] 
Sent: Tuesday, November 16, 2010 5:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Spell Checker

I had to deal with spellchecking today a bit. Make sure you are 
performing the analysis step at index-time as such:

schema.xml:

<fieldType name="textSpell" class="solr.TextField"
  positionIncrementGap="100" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
      words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
      ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
      words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
  </analyzer>
</fieldType>

<fields>
 ...
<field name="spell" type="textSpell" indexed="true" stored="false" 
multiValued="true"/>
</fields>

 From http://wiki.apache.org/solr/SpellCheckingAnalysis:

Use a <copyField> to divert your main text fields to the spell field and
then configure your spell checker to use the spell field to derive the
spelling index.


After this, you'll need to query a spellcheck-enabled handler with 
spellcheck.build=true or enable spellchecker index builds during optimize.

Hope this helps,

Dan Lynn
http://twitter.com/danklynn


On 11/16/2010 05:45 PM, Eric Martin wrote:
 Hi (again)



 I am looking at the spell checker options:




http://wiki.apache.org/solr/SpellCheckerRequestHandler#Term_Source_Configuration



 http://wiki.apache.org/solr/SpellCheckComponent#Use_in_the_Solr_Example



 I am looking in my solrconfig.xml and I see one is already in use. I am kind
 of confused by this because the recommended spell checker is not default in
 my Solr 1.4.1. I have read the documentation but am still fuzzy on what I
 should do.



 My site uses legal terms and as you can see, some terms don't jive with the
 default spell checker so I was hoping to map the spell checker to the body
 for referencing dictionary words. I am unclear what approach I should take
 and how to start the quest.



 Can someone clarify what I should be doing here? Am I on the right track?



 Eric






RE: Spell Checker

2010-11-16 Thread Eric Martin
Ah, I thought I was going nuts. Thanks for clarifying about the Wiki. 

-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io] 
Sent: Tuesday, November 16, 2010 5:11 PM
To: solr-user@lucene.apache.org
Subject: Re: Spell Checker


 Hi (again)
 
 
 
 I am looking at the spell checker options:
 
 
 

http://wiki.apache.org/solr/SpellCheckerRequestHandler#Term_Source_Configuration
 
 
 
 http://wiki.apache.org/solr/SpellCheckComponent#Use_in_the_Solr_Example
 
 
 
 I am looking in my solrconfig.xml and I see one is already in use. I am
 kind of confused by this because the recommended spell checker is not
 default in my Solr 1.4.1. I have read the documentation but am still fuzzy
 on what I should do.
 

Yes, the wiki on the request handler can be confusing indeed as it discusses 
the spellchecker as a request handler instead of a component. Usually, people 
need the spellchecker just as a component in some request handler instead of a 
request handler specifically designed for only spellchecking. I'd forget about 
that wiki and just follow the spellcheck component wiki as it not only 
describes the request handler but also the component, and it is being 
maintained up to the most recent developments in trunk and branch 3.1.

 
 
 My site uses legal terms and as you can see, some terms don't jive with the
 default spell checker so I was hoping to map the spell checker to the body
 for referencing dictionary words. I am unclear what approach I should take
 and how to start the quest.

Map the spellchecker to the body of what? I assume the body of your document 
where the `main content` is stored. In that case, you'd just follow the wiki 
on the component and create a spellchecking fieldType with proper analyzers 
(the example is alright) and define a spellchecking field that has the 
spellcheck fieldType as type (again, like in the example).

Then you'll need to configure the spellchecking component in your solrconfig. 
The example is, again, what you're looking for. All you need to map your 
document's main body to the spellchecker is a copyField directive in your 
schema which will copy your body field to the spellcheck field (which has the 
spellcheck fieldType).
The example on the component wiki page should work. Many features have been 
added since 1.4.x but the examples should work as expected.


 
 
 
 Can someone clarify what I should be doing here? Am I on the right track?
 
 
 
 Eric



RE: Spell Checker

2010-11-16 Thread Eric Martin
Hi:

Ok, I made the changes and have the spell checker build on optimize set to
true. So I guess now, I just reindex. I have to run to class now so I can't
check it for another 30 minutes. Cheers!

-Original Message-
From: Dan Lynn [mailto:d...@danlynn.com] 
Sent: Tuesday, November 16, 2010 5:21 PM
To: solr-user@lucene.apache.org
Subject: Re: Spell Checker

See interjected responses below

On 11/16/2010 06:14 PM, Eric Martin wrote:
 Thanks Dan! Few questions:

 Use a <copyField> to divert your main text fields to the spell field and
 then configure your spell checker to use the spell field to derive the
 spelling index.
Right. A copyField just copies data from one field to another during the 
indexing process. You can copy one field to n other fields without 
affecting the original.
 This will still keep my current copyfield for the same data, right?

 I don't need to rebuild, just reindex.

  After this, you'll need to query a spellcheck-enabled handler with
 spellcheck.build=true or enable spellchecker index builds during
optimize.
If you are using the default solrconfig.xml, a requesthandler should 
already be set up for you (but you shouldn't need a dedicated one for 
production: you can just embed the spell checker component in your 
default handler). Just query the example like this:

http://localhost:8983/solr/spell?q=ANYTHINGHERE&spellcheck=true&spellcheck.collate=true&spellcheck.build=true

Note the spellcheck.build=true parameter.

Cheers,
Dan
http://twitter.com/danklynn


 Totally lost on that.

 I will buy a book here shortly.

 -Original Message-
 From: Dan Lynn [mailto:d...@danlynn.com]
 Sent: Tuesday, November 16, 2010 5:01 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Spell Checker

 I had to deal with spellchecking today a bit. Make sure you are
 performing the analysis step at index-time as such:

 schema.xml:

  <fieldType name="textSpell" class="solr.TextField"
    positionIncrementGap="100" omitNorms="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StandardFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StandardFilterFactory"/>
    </analyzer>
  </fieldType>

 <fields>
   ...
   <field name="spell" type="textSpell" indexed="true" stored="false"
     multiValued="true"/>
 </fields>

From http://wiki.apache.org/solr/SpellCheckingAnalysis:

  Use a <copyField> to divert your main text fields to the spell field and
 then configure your spell checker to use the spell field to derive the
 spelling index.


 After this, you'll need to query a spellcheck-enabled handler with
 spellcheck.build=true or enable spellchecker index builds during optimize.

 Hope this helps,

 Dan Lynn
 http://twitter.com/danklynn


 On 11/16/2010 05:45 PM, Eric Martin wrote:
 Hi (again)



 I am looking at the spell checker options:





http://wiki.apache.org/solr/SpellCheckerRequestHandler#Term_Source_Configuration



 http://wiki.apache.org/solr/SpellCheckComponent#Use_in_the_Solr_Example



 I am looking in my solrconfig.xml and I see one is already in use. I am kind
 of confused by this because the recommended spell checker is not default in
 my Solr 1.4.1. I have read the documentation but am still fuzzy on what I
 should do.



 My site uses legal terms and as you can see, some terms don't jive with the
 default spell checker so I was hoping to map the spell checker to the body
 for referencing dictionary words. I am unclear what approach I should take
 and how to start the quest.



 Can someone clarify what I should be doing here? Am I on the right track?



 Eric







Error When Switching to Tomcat

2010-11-14 Thread Eric Martin
Hi,

 

I have been using Jetty on my linux/apache webserver for about 3 weeks now.
I decided that I should change to Tomcat after realizing I will be indexing
a lot of URLs, and Jetty is good for small production sites as noted in the
Wiki. I am running into this error:

 

org.apache.solr.common.SolrException: Schema Parsing Failed at
org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:656) at
org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:95) at
org.apache.solr.core.SolrCore.<init>

 

My localhost/solr.xml :

 

<Context docBase="/tomcat/webapps/solr.war" debug="0" privileged="true"
allowLinking="true" crossContext="true">

<Environment name="solr/home" type="java.lang.String"
value="/tomcat/webapps/solr/" override="true" />

</Context>

 

My solrconfig.xml:

 

<dataDir>${solr.data.dir:/tomcat/webapps/solr/conf}</dataDir>

I can get to the 8080 Tomcat default page just fine.  I've gone over the
Wiki a couple of dozen times and verified that my solr.xml is configured
correctly based on trial and error and reading the error logs. I just can't
figure out where it is going wrong. I read there are three different ways to
do this. Can someone help me out?

 

I am using Solr 1.4.0 and Tomcat 5.5.30

 

Eric



RE: Error When Switching to Tomcat

2010-11-14 Thread Eric Martin
Hi,

Thank you! I got it working after you jarred my brain. Of course, the
location of the solr instance is arbitrary/logical to tomcat. Sheesh, I feel
kind of small, now. Anyway, I was able to clearly see my mistake from your
information.

As with all help I get from here I posted my fix/walkthrough for others to
see here:

http://drupal.org/node/716632

Thanks a bunch! You helped me and anyone else coming to the Drupal site for
help with Tomcat and Solr :-)

-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: Sunday, November 14, 2010 2:23 AM
To: solr-user@lucene.apache.org
Subject: Re: Error When Switching to Tomcat

Move solr.war file and solrhome directory somewhere else outside the tomcat
webapps. Like /home/foo. Tomcat will generate webapps/solr automatically.

This is what I use, under catalinaHome/conf/Catalina/localhost/solr.xml:

<Context docBase="/home/foo/apache-solr-1.4.0.war" debug="0"
crossContext="true">
   <Environment name="solr/home" type="java.lang.String"
value="/home/foo/SolrHome" override="true" />
</Context>

I also delete the <dataDir>...</dataDir> entry from solrconfig.xml, so that the
data dir is created under the solr home directory.

http://wiki.apache.org/solr/SolrTomcat#Installing_Solr_instances_under_Tomcat
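
Once the webapp deploys, a quick sanity check from the command line; a sketch
assuming Tomcat's default port:

# the admin page should return HTML, the query should return an XML response
curl 'http://localhost:8080/solr/admin/'
curl 'http://localhost:8080/solr/select?q=*:*'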


 I have been using Jetty on my linux/apache webserver for
 about 3 weeks now.
 I decided that I should change to Tomcat after realizing I
 will be indexing
 a lot of URL's and Jetty is good for small production sites
 as noted in the
 Wiki. I am running into this error:
 
  
 
 org.apache.solr.common.SolrException: Schema Parsing Failed at
 org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:656) at
 org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:95) at
 org.apache.solr.core.SolrCore.<init>
 
  
 
 My localhost/solr.xml :
 
  
 
 <Context docBase="/tomcat/webapps/solr.war" debug="0" privileged="true"
 allowLinking="true" crossContext="true">
 
 <Environment name="solr/home" type="java.lang.String"
 value="/tomcat/webapps/solr/" override="true" />
 
 </Context>
 
  
 
 My solrconfig.xml:
 
  
 
 <dataDir>${solr.data.dir:/tomcat/webapps/solr/conf}</dataDir>
 
 I can get to the 8080 Tomcat default page just fine. 
 I've gone over the
 Wiki a couple of dozen times and verified that my solr.xml
 is configured
 correctly based on trial and error and reading the error
 logs. I just can't
 figure out where it is going wrong. I read there are three
 different ways to
 do this. Can someone help me out?


  



Search Result Differences a Puzzle

2010-11-11 Thread Eric Martin
Hi,

 

I cannot find out how this is occurring: 

 

 

nolosearch.com/search/apachesolr_search/law

 

 

You can see that the John Paul Stevens result yields more description in the
search result because of the keyword relevancy, whereas, the other results
just give you a snippet of the title based on keywords found. 

 

I am trying to figure out how to get a standard-size search result no matter
what the relevancy is. While application of this type of result would be
irrelevant to many search engines, it is completely practical in a legal
setting, as a keyword is only as good as how it is being referenced in the
sentence or paragraph. What a dilemma I have!

 

 

I have been trying to figure out if it is the actual schema.xml file or
solrconfig.xml file and for the life of me, I can't find it referenced
anywhere. I tried changing the fragsize to 200 instead of the default of
around 70, but it didn't make any difference after re-indexing.
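
Fragment size is a highlighting parameter, so it can also be set per request
rather than only in solrconfig.xml; a sketch, where the hl.fl field name is an
assumption:

http://localhost:8983/solr/select?q=law&hl=true&hl.fl=body&hl.fragsize=200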

 

 

This problem is super critical to my search results. Like I said, as an
attorney, the word is superfluous until it is attached to a long sentence or
two in order to describe whether the keyword we searched for is relevant, let
alone worthy of a click. That is why my titles are set to open in a new
window: faster access, and if the result is crud, then just close the window
and get back to research.

 

 

Eric

 



RE: importing from java

2010-11-11 Thread Eric Martin
http://wiki.apache.org/solr/DIHQuickStart
http://wiki.apache.org/solr/DataImportHandlerFaq
http://wiki.apache.org/solr/DataImportHandler
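
Once a DIH configuration is registered in solrconfig.xml, a full import is
triggered over HTTP; a sketch, where the handler path depends on how you
register it:

curl 'http://localhost:8983/solr/dataimport?command=full-import'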


-Original Message-
From: Tri Nguyen [mailto:tringuye...@yahoo.com] 
Sent: Thursday, November 11, 2010 9:34 PM
To: solr-user@lucene.apache.org
Subject: Re: importing from java

another question is, can I write my own DataImportHandler class?

thanks,

Tri





From: Tri Nguyen tringuye...@yahoo.com
To: solr user solr-user@lucene.apache.org
Sent: Thu, November 11, 2010 7:01:25 PM
Subject: importing from java

Hi,

I'm restricted to the following in regards to importing.

I have access to a list (Iterator) of Java objects I need to import into
Solr.

Can I import the Java objects as part of Solr's data import interface
(whenever an http request is made to Solr to do a dataimport, it'll call my
Java class to get the objects)?


Before, I had direct read-only access to the db and specified the column
mappings, and things were fine with the data import.

But now I am restricted to using a .jar file that has an api to get the
records in the database, and I need to publish these records in the db. I do
see solrj, but solrj is separate from the solr webapp.

Can I write my own dataimporthandler?

Thanks,

Tri



RE: solr init.d script

2010-11-08 Thread Eric Martin
Er, what flavor?

RHEL / CentOS

#!/bin/sh

# Starts, stops, and restarts Apache Solr.
#
# chkconfig: 35 92 08
# description: Starts and stops Apache Solr

SOLR_DIR=/var/solr
JAVA_OPTIONS="-Xmx1024m -DSTOP.PORT=8079 -DSTOP.KEY=mustard -jar start.jar"
LOG_FILE=/var/log/solr.log
JAVA=/usr/bin/java

case "$1" in
start)
    echo "Starting Solr"
    cd $SOLR_DIR
    $JAVA $JAVA_OPTIONS 2> $LOG_FILE &
    ;;
stop)
    echo "Stopping Solr"
    cd $SOLR_DIR
    $JAVA $JAVA_OPTIONS --stop
    ;;
restart)
    $0 stop
    sleep 1
    $0 start
    ;;
*)
    echo "Usage: $0 {start|stop|restart}" >&2
    exit 1
    ;;
esac
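
To register the script on RHEL/CentOS, assuming it is saved as
/etc/init.d/solr:

chmod a+rx /etc/init.d/solr
# chkconfig reads the runlevel header comment at the top of the script
chkconfig --add solr
chkconfig solr on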




Debian

http://xdeb.org/node/1213

__

Ubuntu

STEPS
Type in the following command in TERMINAL to install nano text editor.
sudo apt-get install nano
Type in the following command in TERMINAL to add a new script.
sudo nano /etc/init.d/solr
TERMINAL will display a new page titled GNU nano 2.0.x.
Paste the below script in this TERMINAL window.
#!/bin/sh -e

# Starts, stops, and restarts solr

SOLR_DIR=/apache-solr-1.4.0/example
JAVA_OPTIONS="-Xmx1024m -DSTOP.PORT=8079 -DSTOP.KEY=stopkey -jar start.jar"
LOG_FILE=/var/log/solr.log
JAVA=/usr/bin/java

case "$1" in
start)
    echo "Starting Solr"
    cd $SOLR_DIR
    $JAVA $JAVA_OPTIONS 2> $LOG_FILE &
    ;;
stop)
    echo "Stopping Solr"
    cd $SOLR_DIR
    $JAVA $JAVA_OPTIONS --stop
    ;;
restart)
    $0 stop
    sleep 1
    $0 start
    ;;
*)
    echo "Usage: $0 {start|stop|restart}" >&2
    exit 1
    ;;
esac
Note: In the above script you might have to replace /apache-solr-1.4.0/example
with the appropriate directory name.
Press the CTRL-X keys.
Type in Y.
When asked for the File Name to Write, press the ENTER key.
You're now back at the TERMINAL command line.

Type in the following command in TERMINAL to create all the links to the
script.
sudo update-rc.d solr defaults
Type in the following command in TERMINAL to make the script executable.
sudo chmod a+rx /etc/init.d/solr
To test. Reboot your Ubuntu Server.
Wait until Ubuntu Server reboot is completed.
Wait 2 minutes for Apache Solr to startup.
Using your internet browser go to your website and try a Solr search.



-Original Message-
From: Nikola Garafolic [mailto:nikola.garafo...@srce.hr] 
Sent: Monday, November 08, 2010 11:42 PM
To: solr-user@lucene.apache.org
Subject: solr init.d script

Hi,

Does anyone have some kind of init.d script for solr, that can start, 
stop and check solr status?

-- 
Nikola Garafolic
SRCE, Sveucilisni racunski centar
tel: +385 1 6165 804
email: nikola.garafo...@srce.hr



RE: Removing irrelevant URLS

2010-11-07 Thread Eric Martin
OK, thanks. I am using nutch and figuring out how to use urlfilters,
unsuccessfully. Just thought there might be a way I could save some trouble
this way. Thanks!

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, November 07, 2010 8:46 AM
To: solr-user@lucene.apache.org
Subject: Re: Removing irrelevant URLS

You can always do a delete-by-query, but that pre-supposes you can form
a query that would remove only those documents with URLs you want
removed... Assuming you do this, an optimize would then physically
remove the documents from your index (delete by query just marks
the docs as deleted).

Solr has nothing specifically for URLs, it's an engine rather than a web
crawling app

Best
Erick
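
As a concrete sketch of the delete-by-query route; the url field name and the
port are assumptions, adjust for your schema:

curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' \
     --data-binary '<delete><query>url:twitter.com</query></delete>'
# an optimize afterwards physically removes the deleted documents
curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' \
     --data-binary '<optimize/>'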

On Fri, Nov 5, 2010 at 4:33 PM, Eric Martin e...@makethembite.com wrote:

 Hi,



 I have 100k URLs in my index. I specifically crawled sites relating to law.
 However, during my initial crawls I didn't specify urlfilters so I am stuck
 with extrinsic and often irrelevant URLs like twitter, etc.



 Is there some way in Solr that I can run periodic URL cleanings to remove
 URLs and search string results? Or, should I just dump my index and rebuild
 using the filter?



 I have looked on the Solr wiki and came across some candidates that look
 like what I am trying to accomplish, but I am not sure. If anyone knows
 where I should be looking I would appreciate it.



 Eric





Adding Carrot2

2010-11-07 Thread Eric Martin
Hi,

 

Solr and nutch have been working fine. I now want to integrate Carrot2. I
followed this tutorial/quickstart:
http://www.lucidimagination.com/blog/2009/09/28/solrs-new-clustering-capabilities/

 

I didn't see anything to adjust in my schema so I didn't do anything there.
I did add the code to the solrconfig.xml though. I am getting this when I
start Solr now:

 

Command: java -Dsolr.clustering.enabled=true -jar start.jar

 

Nov 7, 2010 11:35:16 AM org.apache.solr.common.SolrException log

SEVERE: java.lang.RuntimeException: [solrconfig.xml] requestHandler: missing
mandatory attribute 'class'

 

Anyone run into issues with Carrot2?

 

Eric



RE: Adding Carrot2

2010-11-07 Thread Eric Martin
Yeah I know, you have to download the libraries and copy them to your /lib 
inside of Solr. In Solr 1.4 the plugin is available but the libraries are not. 
http://www.lucidimagination.com/blog/2009/09/28/solrs-new-clustering-capabilities/

I think there is something wrong with the schema and solrconfig (xml's) 
integration. Some documentation on Apache says it's already written into the 
xml and some says it's not. Searching the xml's in Solr, I find no reference to 
clustering. Now that I think about it, I copied over the solrconfig.xml and 
schema.xml with my Drupal/ApacheSolr xml's.

I think I may have answered my own question as to why the clustering isn't 
running correctly. I will go get a copy of the default xml's and if I find it 
there, I will try to merge them. Does this sound like I am on the right path now?
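
For comparison, the clustering wiring in the stock 1.4.1 example
solrconfig.xml looks roughly like this (a sketch; the carrot.title and
carrot.snippet values are assumptions that must be mapped onto your schema's
fields). The "missing mandatory attribute 'class'" error suggests a
requestHandler element lost its class attribute during the merge:

<searchComponent name="clustering"
                 enable="${solr.clustering.enabled:false}"
                 class="org.apache.solr.handler.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">default</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
  </lst>
</searchComponent>

<requestHandler name="/clustering"
                enable="${solr.clustering.enabled:false}"
                class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="clustering">true</bool>
    <str name="clustering.engine">default</str>
    <bool name="clustering.results">true</bool>
    <!-- which stored fields feed the clusterer: assumptions, adjust to your schema -->
    <str name="carrot.title">title</str>
    <str name="carrot.snippet">body</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>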

-Original Message-
From: Lance Norskog [mailto:goks...@gmail.com] 
Sent: Sunday, November 07, 2010 12:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Adding Carrot2

Carrot is already part of the Solr distributions. 1.4.1 and 3.x and the trunk.

On 11/7/10, Eric Martin e...@makethembite.com wrote:
 Hi,



 Solr and nutch have been working fine. I now want to integrate Carrot2. I
 followed this tutorial/quickstart:
 http://www.lucidimagination.com/blog/2009/09/28/solrs-new-clustering-capabilities/



 I didn't see anything to adjust in my schema so I didn't do anything there.
 I did add the code to the solrconfig.xml though. I am getting this when I
 start Solr now:



 Command: java -Dsolr.clustering.enabled=true -jar start.jar



 Nov 7, 2010 11:35:16 AM org.apache.solr.common.SolrException log

 SEVERE: java.lang.RuntimeException: [solrconfig.xml] requestHandler: missing
 mandatory attribute 'class'



 Anyone run into issues with Carrot2?



 Eric




-- 
Lance Norskog
goks...@gmail.com



Removing irrelevant URLS

2010-11-05 Thread Eric Martin
Hi,

 

I have 100k URLs in my index. I specifically crawled sites relating to law.
However, during my initial crawls I didn't specify urlfilters so I am stuck
with extrinsic and often irrelevant URLs like twitter, etc. 

 

Is there some way in Solr that I can run periodic URL cleanings to remove
URLs and search string results? Or, should I just dump my index and rebuild
using the filter? 

 

I have looked on the Solr wiki and came across some candidates that look
like what I am trying to accomplish, but I am not sure. If anyone knows
where I should be looking I would appreciate it.

 

Eric



RE: Solr in virtual host as opposed to /lib

2010-11-01 Thread Eric Martin
I was speaking about apache virtual hosts. I was concerned that there was an 
increase in processing time due to the solr and nutch instances being housed inside 
a virtual host as opposed to being dropped in the root of my distro.

Thank you for the astute clarification.

-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu] 
Sent: Monday, November 01, 2010 9:52 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr in virtual host as opposed to /lib

I think you guys are talking about two different kinds of 'virtual 
hosts'.  Lance is talking about CPU virtualization. Eric appears to be 
talking about apache virtual web hosts, although Eric hasn't told us how 
apache is involved in his setup in the first place, so it's unclear.

Assuming you are using apache to reverse proxy to Solr, there is no 
reason I can think of that your front-end apache setup would affect CPU 
utilization by Solr, let alone by nutch.

Eric Martin wrote:
 Oh. So I should take out the installations and move them to /some_dir as 
 opposed to inside my virtual host of /home/my solr & nutch is here/www
 '

 -Original Message-
 From: Lance Norskog [mailto:goks...@gmail.com]
 Sent: Sunday, October 31, 2010 7:26 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr in virtual host as opposed to /lib

 With virtual hosting you can give CPU & memory quotas to your
 different VMs. This allows you to control the Nutch v.s. The World
 problem. Unforch, you cannot allocate disk channel. With two i/o bound
 apps, this is a problem.

 On Sun, Oct 31, 2010 at 4:38 PM, Eric Martin e...@makethembite.com wrote:
   
 Excellent information. Thank you. Solr is acting just fine then. I can
 connect to it no issues, it indexes fine and there didn't seem to be any
 complication with it. Now I can rule it out and go about solving, what you
 pointed out, and I agree, to be a java/nutch issue.

 Nutch is a crawler I use to feed URLs into Solr for indexing. Nutch is open
 source and found on apache.org

 Thanks for your time.

 -Original Message-
 From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
 Sent: Sunday, October 31, 2010 4:33 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr in virtual host as opposed to /lib

 What servlet container are you putting your Solr in? Jetty? Tomcat?
 Something else?  Are you fronting it with apache on top of that? (I think
 maybe you are, otherwise I'm not sure how the phrase 'virtual host'
 applies).

 In general, Solr of course doesn't care what directory it's in on disk, so
 long as the process running solr has the neccesary read/write permissions to
 the neccesary directories (and if it doesn't, you'd usually find out right
 away with an error message).  And clients to Solr don't care what directory
 it's in on disk either, they only care that they can get it to it connecting
 to a certain port at a certain hostname. In general, if they can't get to it
 on a certain port at a certain hostname, that's something you'd discover
 right away, not something that would be intermittent.  But I'm not familiar
 with nutch, you may want to try connecting to the port you have Solr running
 on (the hostname/port you have told nutch to find solr on?) yourself
 manually, and just make sure it is connectable.

 I can't think of any reason that what directory you have Solr in could cause
 CPU utilization issues. I think it's got nothing to do with that.

 I am not familar with nutch, if it's nutch that's taking 100% of your CPU,
 you might want to find some nutch experts to ask. Perhaps there's a nutch
 listserv?  I am also not familiar with hadoop; you mention just in passing
 that you're using hadoop too, maybe that's an added complication, I don't
 know.

 One obvious reason nutch could be taking 100% cpu would be simply because
 you've asked it to do a lot of work quickly, and it's trying to.

 One reason I have seen Solr take 100% of CPU and become unresponsive, is when
 the Solr process gets caught up in terrible Java garbage collection. If
 that's what's happening, then giving the Solr JVM a higher maximum heap size
 can sometimes help (although confusingly, I've seen people suggest that if
 you give the Solr JVM too MUCH heap it can also result in long GC pauses),
 and if you have a multi-core/multi-CPU machine, I've found the JVM argument
 -XX:+UseConcMarkSweepGC to be very helpful.

 Other than that, it sounds to me like you've got a nutch/hadoop issue, not a
 Solr issue.
 
 From: Eric Martin [e...@makethembite.com]
 Sent: Sunday, October 31, 2010 7:16 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr in virtual host as opposed to /lib

 Hi,

 Thank you. This is more than idle curiosity. I am trying to debug an issue I
 am having with my installation and this is one step in verifying that I have
 a setup that does not consume resources. I am trying to debunk my internal
 myth that having Solr and Nutch in a virtual host would be causing these
 issues

RE: Solr in virtual host as opposed to /lib

2010-11-01 Thread Eric Martin
I don't think you read the entire thread. I'm assuming you made a mistake.

-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: Monday, November 01, 2010 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr in virtual host as opposed to /lib


: References: aanlktimvv5foc2b=gxo+xs1zwgps9o5t5jorwv3id...@mail.gmail.com
: aanlktim30aat8s0nxq_8utxcokv8myyabz8wtxeyl...@mail.gmail.com
: aanlktimpo9v_krgaxomd4hocqabibgzdhc+jhhgsq...@mail.gmail.com
: aanlktimdvaawj7=b7=pgu+rzm+nobvzdfh4o39nkp...@mail.gmail.com
: aanlktindzuwyjxwqqmtr5-rrp4gekvmj5vzzc_f0n...@mail.gmail.com
: In-Reply-To:
aanlktindzuwyjxwqqmtr5-rrp4gekvmj5vzzc_f0n...@mail.gmail.com
: Subject: Solr in virtual host as opposed to /lib

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is hidden in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking



-Hoss



Solr in virtual host as opposed to /lib

2010-10-31 Thread Eric Martin
Is there an issue running Solr in /home/lib as opposed to running it
somewhere outside of the virtual hosts like /lib?

Eric



RE: Solr in virtual host as opposed to /lib

2010-10-31 Thread Eric Martin
  fetcher.Fetcher - -activeThreads=50,
spinWaiting=49, fetchQueues.totalSize=2500
2010-10-31 15:44:21,360 INFO  fetcher.Fetcher - -activeThreads=50,
spinWaiting=49, fetchQueues.totalSize=2500
Can anyone help me out? Did I miss something? Should I be using Tomcat? One
interesting part of this is that when I try to change the nutch settings for
post url and urls by score to 1, they stay at 10 no matter what I do.

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, October 31, 2010 4:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr in virtual host as opposed to /lib

Can you expand on your question? Are you having a problem? Is this idle
curiosity?

Because I have no idea how to respond when there is so little information.

Best
Erick

On Sun, Oct 31, 2010 at 5:32 PM, Eric Martin e...@makethembite.com wrote:

 Is there an issue running Solr in /home/lib as opposed to running it
 somewhere outside of the virtual hosts like /lib?

 Eric





RE: Solr in virtual host as opposed to /lib

2010-10-31 Thread Eric Martin
Excellent information. Thank you. Solr is acting just fine then. I can
connect to it no issues, it indexes fine and there didn't seem to be any
complication with it. Now I can rule it out and go about solving, what you
pointed out, and I agree, to be a java/nutch issue.

Nutch is a crawler I use to feed URLs into Solr for indexing. Nutch is open
source and found on apache.org

Thanks for your time.

-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu] 
Sent: Sunday, October 31, 2010 4:33 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr in virtual host as opposed to /lib

What servlet container are you putting your Solr in? Jetty? Tomcat?
Something else?  Are you fronting it with apache on top of that? (I think
maybe you are, otherwise I'm not sure how the phrase 'virtual host'
applies). 

In general, Solr of course doesn't care what directory it's in on disk, so
long as the process running solr has the neccesary read/write permissions to
the neccesary directories (and if it doesn't, you'd usually find out right
away with an error message).  And clients to Solr don't care what directory
it's in on disk either, they only care that they can get it to it connecting
to a certain port at a certain hostname. In general, if they can't get to it
on a certain port at a certain hostname, that's something you'd discover
right away, not something that would be intermittent.  But I'm not familiar
with nutch, you may want to try connecting to the port you have Solr running
on (the hostname/port you have told nutch to find solr on?) yourself
manually, and just make sure it is connectable. 

I can't think of any reason that what directory you have Solr in could cause
CPU utilization issues. I think it's got nothing to do with that. 

I am not familar with nutch, if it's nutch that's taking 100% of your CPU,
you might want to find some nutch experts to ask. Perhaps there's a nutch
listserv?  I am also not familiar with hadoop; you mention just in passing
that you're using hadoop too, maybe that's an added complication, I don't
know. 

One obvious reason nutch could be taking 100% cpu would be simply because
you've asked it to do a lot of work quickly, and it's trying to. 

One reason I have seen Solr take 100% of CPU and become unresponsive, is when
the Solr process gets caught up in terrible Java garbage collection. If
that's what's happening, then giving the Solr JVM a higher maximum heap size
can sometimes help (although confusingly, I've seen people suggest that if
you give the Solr JVM too MUCH heap it can also result in long GC pauses),
and if you have a multi-core/multi-CPU machine, I've found the JVM argument
-XX:+UseConcMarkSweepGC to be very helpful. 

Other than that, it sounds to me like you've got a nutch/hadoop issue, not a
Solr issue. 
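
As a sketch of those JVM suggestions applied to a Jetty-style start command;
the heap size is just an example:

java -Xmx2048m -XX:+UseConcMarkSweepGC -jar start.jar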

From: Eric Martin [e...@makethembite.com]
Sent: Sunday, October 31, 2010 7:16 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr in virtual host as opposed to /lib

Hi,

Thank you. This is more than idle curiosity. I am trying to debug an issue I
am having with my installation and this is one step in verifying that I have
a setup that does not consume resources. I am trying to debunk my internal
myth that having Solr and Nutch in a virtual host would be causing these
issues. Here is the main issue that involves Nutch/Solr and Drupal:

/home/mootlaw/lib/solr
/home/mootlaw/lib/nutch
/home/mootlaw/www/Drupal site

I'm running a 1333 FSB Dual Socket Xeon 5500 Series @ 2.4GHz, Enterprise
Linux - x86_64 - OS, 12 GB RAM. My Solr and Nutch are running. I am using
jetty for my Solr. My server is not rooted.

Nutch is using 100% of my cpus. I see this in my CPU utilization in my whm:

/usr/bin/java -Xmx1000m -Dhadoop.log.dir=/home/mootlaw/lib/nutch/logs
-Dhadoop.log.file=hadoop.log
-Djava.library.path=/home/mootlaw/lib/nutch/lib/native/Linux-amd64-64
-classpath
/home/mootlaw/lib/nutch/conf:/usr/lib/tools.jar:/home/mootlaw/lib/nutch/buil
d:/home/mootlaw/lib/nutch/build/test/classes:/home/mootlaw/lib/nutch/build/n
utch-1.2.job:/home/mootlaw/lib/nutch/nutch-*.job:/home/mootlaw/lib/nutch/lib
/apache-solr-core-1.4.0.jar:/home/mootlaw/lib/nutch/lib/apache-solr-solrj-1.
4.0.jar:/home/mootlaw/lib/nutch/lib/commons-beanutils-1.8.0.jar:/home/mootla
w/lib/nutch/lib/commons-cli-1.2.jar:/home/mootlaw/lib/nutch/lib/commons-code
c-1.3.jar:/home/mootlaw/lib/nutch/lib/commons-collections-3.2.1.jar:/home/mo
otlaw/lib/nutch/lib/commons-el-1.0.jar:/home/mootlaw/lib/nutch/lib/commons-h
ttpclient-3.1.jar:/home/mootlaw/lib/nutch/lib/commons-io-1.4.jar:/home/mootl
aw/lib/nutch/lib/commons-lang-2.1.jar:/home/mootlaw/lib/nutch/lib/commons-lo
gging-1.0.4.jar:/home/mootlaw/lib/nutch/lib/commons-logging-api-1.0.4.jar:/h
ome/mootlaw/lib/nutch/lib/commons-net-1.4.1.jar:/home/mootlaw/lib/nutch/lib/
core-3.1.1.jar:/home/mootlaw/lib/nutch/lib/geronimo-stax-api_1.0_spec-1.0.1.
jar:/home/mootlaw/lib/nutch/lib/hadoop-0.20.2-core.jar:/home/mootlaw/lib

RE: Solr in virtual host as opposed to /lib

2010-10-31 Thread Eric Martin
Oh. So I should take out the installations and move them to /some_dir as 
opposed to inside my virtual host of /home/my solr & nutch is here/www
'
 
-Original Message-
From: Lance Norskog [mailto:goks...@gmail.com] 
Sent: Sunday, October 31, 2010 7:26 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr in virtual host as opposed to /lib

With virtual hosting you can give CPU & memory quotas to your
different VMs. This allows you to control the Nutch v.s. The World
problem. Unforch, you cannot allocate disk channel. With two i/o bound
apps, this is a problem.

On Sun, Oct 31, 2010 at 4:38 PM, Eric Martin e...@makethembite.com wrote:
 Excellent information. Thank you. Solr is acting just fine then. I can
 connect to it no issues, it indexes fine and there didn't seem to be any
 complication with it. Now I can rule it out and go about solving, what you
 pointed out, and I agree, to be a java/nutch issue.

 Nutch is a crawler I use to feed URLs into Solr for indexing. Nutch is open
 source and found on apache.org

 Thanks for your time.

 -Original Message-
 From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
 Sent: Sunday, October 31, 2010 4:33 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr in virtual host as opposed to /lib

 What servlet container are you putting your Solr in? Jetty? Tomcat?
 Something else?  Are you fronting it with apache on top of that? (I think
 maybe you are, otherwise I'm not sure how the phrase 'virtual host'
 applies).

 In general, Solr of course doesn't care what directory it's in on disk, so
 long as the process running solr has the neccesary read/write permissions to
 the neccesary directories (and if it doesn't, you'd usually find out right
 away with an error message).  And clients to Solr don't care what directory
 it's in on disk either, they only care that they can get it to it connecting
 to a certain port at a certain hostname. In general, if they can't get to it
 on a certain port at a certain hostname, that's something you'd discover
 right away, not something that would be intermittent.  But I'm not familiar
 with nutch, you may want to try connecting to the port you have Solr running
 on (the hostname/port you have told nutch to find solr on?) yourself
 manually, and just make sure it is connectable.

 I can't think of any reason that what directory you have Solr in could cause
 CPU utilization issues. I think it's got nothing to do with that.

 I am not familar with nutch, if it's nutch that's taking 100% of your CPU,
 you might want to find some nutch experts to ask. Perhaps there's a nutch
 listserv?  I am also not familiar with hadoop; you mention just in passing
 that you're using hadoop too, maybe that's an added complication, I don't
 know.

 One obvious reason nutch could be taking 100% cpu would be simply because
 you've asked it to do a lot of work quickly, and it's trying to.

 One reason I have seen Solr take 100% of CPU and become unresponsive, is when
 the Solr process gets caught up in terrible Java garbage collection. If
 that's what's happening, then giving the Solr JVM a higher maximum heap size
 can sometimes help (although confusingly, I've seen people suggest that if
 you give the Solr JVM too MUCH heap it can also result in long GC pauses),
 and if you have a multi-core/multi-CPU machine, I've found the JVM argument
 -XX:+UseConcMarkSweepGC to be very helpful.

 Other than that, it sounds to me like you've got a nutch/hadoop issue, not a
 Solr issue.
 
 From: Eric Martin [e...@makethembite.com]
 Sent: Sunday, October 31, 2010 7:16 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr in virtual host as opposed to /lib

 Hi,

 Thank you. This is more than idle curiosity. I am trying to debug an issue I
 am having with my installation and this is one step in verifying that I have
 a setup that does not consume resources. I am trying to debunk my internal
 myth that having Solr and Nutch in a virtual host would be causing these
 issues. Here is the main issue that involves Nutch/Solr and Drupal:

 /home/mootlaw/lib/solr
 /home/mootlaw/lib/nutch
 /home/mootlaw/www/Drupal site

 I'm running a 1333 FSB Dual Socket Xeon 5500 Series @ 2.4GHz, Enterprise
 Linux - x86_64 - OS, 12 GB RAM. My Solr and Nutch are running. I am using
 jetty for my Solr. My server is not rooted.

 Nutch is using 100% of my cpus. I see this in my CPU utilization in my whm:

 /usr/bin/java -Xmx1000m -Dhadoop.log.dir=/home/mootlaw/lib/nutch/logs
 -Dhadoop.log.file=hadoop.log
 -Djava.library.path=/home/mootlaw/lib/nutch/lib/native/Linux-amd64-64
 -classpath
 /home/mootlaw/lib/nutch/conf:/usr/lib/tools.jar:/home/mootlaw/lib/nutch/buil
 d:/home/mootlaw/lib/nutch/build/test/classes:/home/mootlaw/lib/nutch/build/n
 utch-1.2.job:/home/mootlaw/lib/nutch/nutch-*.job:/home/mootlaw/lib/nutch/lib
 /apache-solr-core-1.4.0.jar:/home/mootlaw/lib/nutch/lib/apache-solr-solrj-1.
 4.0.jar:/home/mootlaw/lib/nutch/lib/commons-beanutils

Basic Document Question

2010-10-30 Thread Eric Martin
HI everyone,

 

I'm new which won't be hard to figure out after I ask this question:

 

I use Drupal/Solr/Nutch

 

http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/conf/schema.xml?view=markup

 

Solr specific:

How do I re-index for specific content only? I am starting a legal index
specifically geared for law students and lawyers. I am crawling law-related
sites but I really don't want to index law firms, just the law content on
places like:

http://www.ecasebriefs.com/blog/law/

http://www.lawnix.com/cases/cases-index/

http://www.oyez.org/

http://www.4lawnotes.com/

http://www.docstoc.com/documents/education/law-school/case-briefs

http://www.lawschoolcasebriefs.com/

http://dictionary.findlaw.com/ 

 

As I was saying, while crawling I get all kinds of extrinsic information put
into the Solr index. How do I combat that?

 

I am assuming (cough) that I can do this but I am really at a loss as to
where I start to look to get this done. I prefer to learn and I definitely
don't want to waste anyone's time.

 

Non-Solr Specific

Does anyone here help with nutch or is this Solr only?

 

I am sorry if I am asking elementary questions and am asking in the wrong
place. I just need to be pointed to the right place. I'm sort of
lost. (Imagine that.) 

 

Thanks

 

Eric

 

 

 



RE: Does anyone notice this site?

2010-10-25 Thread Eric Martin
This is not legal advice. Take this as it is. Just off my head and what I
know. I did not research this, but could, if Solr wants me to.

From a marketing standpoint, probably. 

From a legal standpoint: they can do whatever they want with the name Solr
so long as they maintain a distance between any trademarked name and the
fundamental use of the trademark, unless there is a substantial connection
between the trademark name and recognition. Of course, that is to be
determined by a few factors: length in business, trademarks carried, whether
or not the offended trademark holder makes a claim (not making a claim limits your
recovery substantially and may even nullify it). They are also in South
Africa. So, throw in international law.

Of course, you also have fair use law. Well, this can get tricky. Here is an
example: myspace.com and moremyspace.com. If moremyspace.com is used as a
social networking site then myspace has a claim. If it is used as a social
networking site in parody then myspace has no legal claim whatsoever.

Another example is booble.com (not a work-safe link!). That case lasted many
years and google lost. 

Trademarks are a very tricky business and one that I will never practice.
Anyway, seeing as how they are making a search engine, using a lower-level
FQDN, and have not made a dent in the industry, it would be futile to do
anything but send them an email laying claim to the name Solr.

*If you do not send them a letter/email laying claim to Solr you will lose
your rights to fight that battle with IANA, etc or the ability to seek legal
remedy.*

Eric
Law Student - Second Year



-Original Message-
From: scott chu [mailto:scott@udngroup.com] 
Sent: Monday, October 25, 2010 9:55 AM
To: solr-user@lucene.apache.org
Subject: Does anyone notice this site?

I happened to bump into this site: http://www.solr.biz/

They said they are also developing a search engine? Does this have any connection 
to open source Solr? 



Integrating Carrot2/Solr Deafult Example

2010-10-24 Thread Eric Martin
Hello,

 

Welcome to all. I am a very basic user. I have limited knowledge. I read the
documentation, I have an 'example' Solr installation working on my server. I
have Drupal 6. I have Drupal using Solr (apachesolr) as its default search
engine. I have 1 document in the database that is searchable for testing
purposes. I would like to know, if I am using all default paths in my Solr
installation, how do I enable Carrot2? Once enabled, how do I verify that it
is clustering properly?

 

Carrot2 doc I read:
http://download.carrot2.org/head/manual/index.html#chapter.application-suite

Clustering Wiki Solr I read: http://wiki.apache.org/solr/ClusteringComponent

 

I know this is really basic stuff and I really appreciate the help. I
fumbled my way through installing Solr on my own, setting up Drupal, etc. I
am a former Natural V2 3270 programmer (basic flat-file OO) and have limited
experience in PHP, Java, Jetty, etc. However, I can read code, decipher what
it is doing, and find a solution and then implement it. I just really have
no foundation for Carrot2/Solr, yet.

 

 Any help, pointers, and 'look here's would be very much appreciated. 

 

Eric Martin