Re: Replication snapshot, tar says file changed as we read it

2011-03-24 Thread Andrew Clegg
Sorry to re-open an old thread, but this just happened to me again,
even with a 30 second sleep between taking the snapshot and starting
to tar it up. Then, even more strangely, the snapshot was removed
again before tar completed.

Archiving snapshot.20110320113401 into
/var/www/mesh/backups/weekly.snapshot.20110320113401.tar.bz2
tar: snapshot.20110320113401/_neqv.fdt: file changed as we read it
tar: snapshot.20110320113401/_neqv.prx: File removed before we read it
tar: snapshot.20110320113401/_neqv.fnm: File removed before we read it
tar: snapshot.20110320113401: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors

Has anybody seen this before, or been able to replicate it themselves?
(no pun intended)

Or, is anyone else using replication snapshots for backup? Have I
misunderstood them? I thought the point of a snapshot was that once
taken it was immutable.

If it's important, this is on a machine configured as a replication
master, but with no slaves attached to it (it's basically a failover
and backup machine).

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">startup</str>
      <str name="replicateAfter">commit</str>
      <str name="confFiles">admin-extra.html,elevate.xml,protwords.txt,schema.xml,scripts.conf,solrconfig_slave.xml:solrconfig.xml,stopwords.txt,synonyms.txt</str>
      <str name="commitReserveDuration">00:00:10</str>
    </lst>
  </requestHandler>

Thanks,

Andrew.


On 16 January 2011 12:55, Andrew Clegg andrew.cl...@gmail.com wrote:
 PS one other point I didn't mention is that this server has a very
 fast autocommit limit (2 seconds max time).

 But I don't know if this is relevant -- I thought the files in the
 snapshot wouldn't be committed to again. Please correct me if this is
 a huge misunderstanding.

 On 16 January 2011 12:30, Andrew Clegg andrew.cl...@gmail.com wrote:
 (Many apologies if this appears twice, I tried to send it via Nabble
 first but it seems to have got stuck, and is fairly urgent/serious.)

 Hi,

 I'm trying to use the replication handler to take snapshots, then
 archive them and ship them off-site.

 Just now I got a message from tar that worried me:

 tar: snapshot.20110115035710/_70b.tis: file changed as we read it
 tar: snapshot.20110115035710: file changed as we read it

 The relevant bit of script that does it looks like this (error
 checking removed):

 curl 'http://localhost:8983/solr/core1/replication?command=backup'
 PREFIX=''
 if [[ $START_TIME =~ 'Sun' ]]
 then
        PREFIX='weekly.'
 fi
 cd $SOLR_DATA_DIR
 for snapshot in `ls -d -1 snapshot.*`
 do
        TARGET=${LOCAL_BACKUP_DIR}/${PREFIX}${snapshot}.tar.bz2
        echo Archiving ${snapshot} into $TARGET
        tar jcf $TARGET $snapshot
        echo Deleting ${snapshot}
        rm -rf $snapshot
 done

 I was under the impression that files in the snapshot were guaranteed
 to never change, right? Otherwise what's the point of the replication
 backup command?

 I tried putting in a 30-second sleep after the snapshot and before the
 tar, but the error occurred again anyway.
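
For what it's worth, the backup command appears to return before the snapshot is fully written, so any fixed sleep is a race. One defensive option (a sketch only; the helper and its thresholds are my own, not part of Solr) is to wait until the snapshot directory stops changing before starting tar:

```python
import os
import time

def newest_mtime(path):
    """Most recent mtime anywhere in a directory tree (the dir itself or any file)."""
    latest = os.stat(path).st_mtime
    for root, _dirs, files in os.walk(path):
        for name in files:
            latest = max(latest, os.stat(os.path.join(root, name)).st_mtime)
    return latest

def wait_until_stable(path, quiet_secs=5.0, timeout=300.0, poll=1.0):
    """Block until `path` has not changed for `quiet_secs`; False on timeout."""
    deadline = time.time() + timeout
    last = newest_mtime(path)
    quiet_since = time.time()
    while time.time() < deadline:
        time.sleep(poll)
        current = newest_mtime(path)
        if current != last:            # snapshot still being written
            last = current
            quiet_since = time.time()
        elif time.time() - quiet_since >= quiet_secs:
            return True                # no changes for quiet_secs: safe to tar
    return False
```

The backup script could call something like this between the curl and the tar; it still would not help if Solr later deletes the snapshot outright, which is the second problem reported above.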

 There was a message from Lance N. with a similar error in, years ago:

 http://www.mail-archive.com/solr-user@lucene.apache.org/msg06104.html

 but that would be pre-replication anyway, right?

 This is on Ubuntu 10.10 using java 1.6.0_22 and Solr 1.4.0.

 Thanks,

 Andrew.


 --

 :: http://biotext.org.uk/ :: http://twitter.com/andrew_clegg/ ::




 --

 :: http://biotext.org.uk/ :: http://twitter.com/andrew_clegg/ ::




-- 

:: http://biotext.org.uk/ :: http://twitter.com/andrew_clegg/ ::


Re: which German stemmer to use?

2011-03-24 Thread Paul Libbrecht
In our ActiveMath project, we have had positive feedback in Lucene with
 new SnowballAnalyzer(Version.LUCENE_29, "German")
which is probably one of the two below.

Note that you may want to be careful to use one field with exact matching
(e.g. whitespace analyzer and lowercase filter) and one field with stemmed
matches. That's two fields in the index and a query-expansion mechanism such as
dismax with a qf along the lines of

  text-de^2.0 text-de.stemmed^1.2
(add the phonetic...)

One of the biggest issues that our testers raised is that compound words
should be split. I believe this issue is also very present in technology texts.
Thus far only the compound-words analyzer can do such a split, and it needs the
compounds to be manually input. Maybe that's doable?
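
If I recall correctly, the dictionary-based compound filter works roughly like this toy sketch (function name and thresholds are illustrative only, not the real implementation): it scans each token for dictionary entries and emits them as extra subword tokens alongside the original.

```python
def split_compound(word, dictionary, min_part=3):
    """Naive dictionary-based decompounder: return the original token plus any
    dictionary entries found as substrings of at least min_part characters."""
    w = word.lower()
    parts = []
    for start in range(len(w)):
        for end in range(start + min_part, len(w) + 1):
            if w[start:end] in dictionary:
                parts.append(w[start:end])
    return [w] + parts

d = {"fussball", "welt", "meisterschaft"}
print(split_compound("Fussballweltmeisterschaft", d))
# the original token plus the subwords fussball / welt / meisterschaft
```

This also shows why the dictionary has to be manually curated: only substrings that appear in it are ever emitted.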

paul


Le 24 mars 2011 à 00:14, Christopher Bottaro a écrit :

 The wiki lists 5 available, but doesn't do a good job at explaining or
 recommending one:
 
 GermanStemFilterFactory
 SnowballPorterFilterFactory (German)
 SnowballPorterFilterFactory (German2)
 GermanLightStemFilterFactory
 GermanMinimalStemFilterFactory
 
 Which is the best one to use in general?  Which is the best to use when the
 content being indexed is German technology articles?
 
 Thanks for the help.



Re: Problem with field collapsing of patched Solr 1.4

2011-03-24 Thread Kai Schlamp-2

Afroz Ahmad wrote:
 
 Have you enabled the collapse component in solconfig.xml?
 
 <searchComponent name="query"
 class="org.apache.solr.handler.component.CollapseComponent" />
 

No, it seems that I missed that completely. Thank you, Afroz. It works fine
now.

Kai


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-field-collapsing-of-patched-Solr-1-4-tp2678850p2724321.html
Sent from the Solr - User mailing list archive at Nabble.com.


boosting with standard search handler

2011-03-24 Thread Gastone Penzo
Hi,
is it possible to boost fields like the bf parameter of dismax in the standard
request handler?
with or without functions?

thanx

-- 
Gastone Penzo

*www.solr-italia.it*
*The first italian blog about Apache Solr*


Re: boosting with standard search handler

2011-03-24 Thread Tommaso Teofili
Hi Gastone,
I used to do that in standard search handler using the following parameters:
q={!boost b=query($qq,0.7)} text:something title:other
qq=date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8
which enables custom recency-based boosting.
My 2 cents,
Tommaso


2011/3/24 Gastone Penzo gastone.pe...@gmail.com

 Hi,
 is possibile to boost fields like bf parameter of dismax in standard
 request handler?
 with or without funcions?

 thanx

 --
 Gastone Penzo

 *www.solr-italia.it*
 *The first italian blog about Apache Solr*



Re: boosting with standard search handler

2011-03-24 Thread Gastone Penzo
Thank you, Tommaso,
your solution works.
I read there's another method, using the _val_ parameter.

Thanks

Gastone

2011/3/24 Tommaso Teofili tommaso.teof...@gmail.com

 Hi Gastone,
 I used to do that in standard search handler using the following
 parameters:
 q={!boost b=query($qq,0.7)} text:something title:other
 qq=date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8
 that enabling custom recency based boosting.
 My 2 cents,
 Tommaso


 2011/3/24 Gastone Penzo gastone.pe...@gmail.com

  Hi,
  is possibile to boost fields like bf parameter of dismax in standard
  request handler?
  with or without funcions?
 
  thanx
 
  --
  Gastone Penzo
 
  *www.solr-italia.it*
  *The first italian blog about Apache Solr*
 




-- 
Gastone Penzo


Re: Why boost query not working?

2011-03-24 Thread Ahmet Arslan


--- On Thu, 3/24/11, cyang2010 ysxsu...@hotmail.com wrote:

 This Solr query fails:
 1. get every title regardless of what the title_name is
 2. within the result, boost the ones whose genre id =
 56.  (bq=genres:56^100)
 
 http://localhost:8983/solr/titles/select?indent=on&version=2.2&start=0&rows=10&fl=*%2Cscore&wt=standard&defType=dismax&qf=title_name_en_US&q=*%3A*&bq=genres%3A56^100&debugQuery=on
 
 
 But from the debug output I can tell it confuses the boost query
 parameter with part of the query string:
 
 <lst name="debug">
 <str name="rawquerystring">*:*</str>
 <str name="querystring">*:*</str>
 <str name="parsedquery">+() () genres:56^100.0</str>
 <str name="parsedquery_toString">+() () genres:56^100.0</str>
 <lst name="explain"/>
 <str name="QParser">DisMaxQParser</str>
 <null name="altquerystring"/>
 <arr name="boost_queries">
 <str>genres:56^100</str>
 </arr>

With dismax, you cannot use semicolons or field queries. Instead of q=*:*, you 
can try q.alt=*:* (and do not use the q parameter at all).





invert terms in search with exact match

2011-03-24 Thread Gastone Penzo
Hi,
is it possible with the standard query search (not dismax) to have
exact matches that allow any term order?

for example:

if I search "my love" I would like Solr to give me docs with

- my love
- love my

that's easy: q=title:(my AND love)

the problem is that it also returns docs with

my love is my dog

I don't want this. I want only docs whose title is formed by these 2 terms: my
and love.

is it possible??

thanx

-- 
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*


Re: invert terms in search with exact match

2011-03-24 Thread Ahmet Arslan


--- On Thu, 3/24/11, Gastone Penzo gastone.pe...@gmail.com wrote:

 From: Gastone Penzo gastone.pe...@gmail.com
 Subject: invert terms in search with exact match
 To: solr-user@lucene.apache.org
 Date: Thursday, March 24, 2011, 3:58 PM
 Hi,
 is it possible with standard query search (not dismax) to
 have
 exact matches that allow any terms order?
 
 for example:
 
 if i search my love i would solr gives to me docs with
 
 - my love
 - love my
 
 it's easy: q=title:(my AND love)
 
 the problem is it returns also docs with
 
 my love is my dog
 
 i don't want this. i want only docs with title formed by
 these 2 terms: my
 and love.

PhraseQuery has an interesting property: if you don't use a slop value (meaning 
zero) it is an ordered phrase query. However, starting from 1, it is un-ordered.

"my love"~1 will somehow satisfy you. If you really want "my love" to be unordered 
you can try SOLR-1604.


  


Re: how to run boost query for non-dismax query parser

2011-03-24 Thread Ahmet Arslan

 I need to code some boosting logic when some field equal to
 some value.   I
 was able to get it work if using dismax query parser. 
 However, since the
 solr query will need to handle prefix or fuzzy query,
 therefore, dismax
 query parser is not really my choice.  
 
 Therefore, i want to use standard query parser, but still
 have dismax's
 boosting query logic.  For example, this query return
 all the titles
 regardless what the value is, however, will boost the score
 of those which
 genres=5237:
 
  http://localhost:8983/solr/titles/select?indent=on&start=0&rows=10&fl=*%2Cscore&wt=standard&explainOther=&hl.fl=&qt=standard&q={!boost%20b=genres:5237^2.2}*%3A*&debugQuery=on
 
 
 Here is the exception i get:
 HTTP ERROR: 400
 
 org.apache.lucene.queryParser.ParseException: Expected ','
 at position 6 in
 'genres:5237^2.2'

BoostingQParserPlugin takes a FunctionQuery; in your case it is a lucene/solr 
query. If you want to boost by a solr/lucene query, you can add that clause as an 
optional clause. That's all.

q=+*:* genres:5237^2.2&q.op=OR will do the trick. Just make sure that you are 
using OR as the default operator.
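
A toy model (nothing like Lucene's actual scoring math) of why this works: the required `+*:*` clause decides membership, while the optional boosted clause only adds to the score when it matches, so it reorders results without filtering them.

```python
def score(doc, required_matches, optional_field, optional_value, boost):
    """Required clause gates membership; optional clause only adds score."""
    if not required_matches:           # '+*:*' matches every doc, so this is always True here
        return None                    # doc excluded from the result set
    s = 1.0                            # base score from the required clause
    if doc.get(optional_field) == optional_value:
        s += boost                     # 'genres:5237^2.2' contributes only when it matches
    return s

docs = [{"id": 1, "genres": 5237}, {"id": 2, "genres": 99}]
ranked = sorted(docs, key=lambda d: score(d, True, "genres", 5237, 2.2), reverse=True)
print([d["id"] for d in ranked])  # [1, 2]: both docs survive, the boosted one ranks first
```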





Re: invert terms in search with exact match

2011-03-24 Thread Tommaso Teofili
Hi Gastone,
I think you should use a proximity search as described in the Lucene query
syntax page [1].
So searching for "my love"~2 should work for your use case.
Cheers,
Tommaso

[1] : 
http://lucene.apache.org/java/2_9_3/queryparsersyntax.html#ProximitySearches

2011/3/24 Gastone Penzo gastone.pe...@gmail.com

 Hi,
 is it possible with standard query search (not dismax) to have
 exact matches that allow any terms order?

 for example:

 if i search my love i would solr gives to me docs with

 - my love
 - love my

 it's easy: q=title:(my AND love)

 the problem is it returns also docs with

 my love is my dog

 i don't want this. i want only docs with title formed by these 2 terms: my
 and love.

 is it possible??

 thanx

 --
 Gastone Penzo
 *www.solr-italia.it*
 *The first italian blog about Apache Solr*



Question about http://wiki.apache.org/solr/Deduplication

2011-03-24 Thread eks dev
Hi,
The use case I am trying to figure out is preserving IDs without
re-indexing on a duplicate, instead adding the new ID to a list of
document id aliases.

Example:
Input collection:
id:1, text:dummy text 1, signature:A
id:2, text:dummy text 1, signature:A

I add the first document in empty index, text is going to be indexed,
ID is going to be 1, so far so good

Now the question: if I add the second document with id == 2, instead of
deleting/indexing this new document, I would like to store id == 2 in the
multivalued field id.

At the end, I would have one document less indexed, and both IDs are
going to be searchable (and stored as well)...

Is it possible in Solr to have a multivalued id? Or do I need to make my
own mv_ID for this? Any ideas how to achieve this efficiently?

My target is not to add new documents if the signature matches, but to
have both IDs indexed and stored.
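
The behaviour being asked for — index once, keep every colliding ID searchable — can be sketched outside Solr like this (plain Python, purely illustrative; as far as I know Solr's signature-based dedup overwrites or drops duplicates rather than merging IDs, so this would need a custom update processor):

```python
class DedupIndex:
    """Sketch of the desired behaviour: on a signature collision, do not
    re-index; append the new id to the existing document's multivalued id list."""

    def __init__(self):
        self.by_signature = {}  # signature -> stored document

    def add(self, doc_id, text, signature):
        existing = self.by_signature.get(signature)
        if existing is None:
            # first time we see this signature: index the document normally
            self.by_signature[signature] = {"ids": [doc_id], "text": text}
        else:
            # duplicate content: record the id as an alias, skip re-indexing
            existing["ids"].append(doc_id)

idx = DedupIndex()
idx.add(1, "dummy text 1", "A")
idx.add(2, "dummy text 1", "A")
print(len(idx.by_signature), idx.by_signature["A"]["ids"])  # 1 [1, 2]
```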

Thanks,
eks


Detecting an empty index during start-up

2011-03-24 Thread David McLaughlin
Hi,

In our Solr deployment we have a cluster of replicated Solr cores, with the
small change that we have dynamic master look-up using ZooKeeper. The
problem I am trying to solve is to make sure that when a new Solr core joins
the cluster it isn't made available to any search services until it has been
filled with data.

I am not familiar with Solr internals, so the approach I wanted to take was
to basically check the numDocs property of the index during start-up and set
a READABLE state in the ZooKeeper node if it's greater than 0. I also
planned to create a commit hook for replication and updating which
controlled the READABLE property based on numDocs also.

This just leaves the problem of finding out the number of documents during
start-up. I planned to have something like:

int numDocs = 0;
RefCounted<SolrIndexSearcher> searcher = core.getSearcher();
try {
    numDocs = searcher.get().getIndexReader().numDocs();
} finally {
    searcher.decref();
}

but getSearcher's documentation specifically says not to use it from the
inform method. I missed this at first, and of course I got a deadlock
(although only when I had more than one core on the same Solr instance).

Is there a simpler way to do what I want? Or will I just need to have a
thread which waits until the Searcher is available before setting the
state?

Thanks,
David


Re: invert terms in search with exact match

2011-03-24 Thread Gastone Penzo
Hi Tommaso,
thank you for the answer, but the problem with your solution is that Solr
returns to me
also docs with other words. For example:

my love is the world

I want to exclude the other words.
It must give me only docs with "my love" or "love my". Stop.

Thank you

2011/3/24 Tommaso Teofili tommaso.teof...@gmail.com

 Hi Gastone,
 I think you should use proximity search as described here in Lucene query
 syntax page [1].
 So searching for my love~2 should work for your use case.
 Cheers,
 Tommaso

 [1] : 
 http://lucene.apache.org/java/2_9_3/queryparsersyntax.html#ProximitySearches

 2011/3/24 Gastone Penzo gastone.pe...@gmail.com

 Hi,
 is it possible with standard query search (not dismax) to have
 exact matches that allow any terms order?

 for example:

 if i search my love i would solr gives to me docs with

 - my love
 - love my

 it's easy: q=title:(my AND love)

 the problem is it returns also docs with

 my love is my dog

 i don't want this. i want only docs with title formed by these 2 terms: my
 and love.

 is it possible??

 thanx

 --
 Gastone Penzo
 *www.solr-italia.it*
 *The first italian blog about Apache Solr*





-- 
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*


Re: invert terms in search with exact match

2011-03-24 Thread Bill Bell
Yes, create a qt with dismax and qf on a field that has query-time stopwords for the 
words you want to ignore.

Bill Bell
Sent from mobile


On Mar 24, 2011, at 7:58 AM, Gastone Penzo gastone.pe...@gmail.com wrote:

 Hi,
 is it possible with standard query search (not dismax) to have
 exact matches that allow any terms order?
 
 for example:
 
 if i search my love i would solr gives to me docs with
 
 - my love
 - love my
 
 it's easy: q=title:(my AND love)
 
 the problem is it returns also docs with
 
 my love is my dog
 
 i don't want this. i want only docs with title formed by these 2 terms: my
 and love.
 
 is it possible??
 
 thanx
 
 -- 
 Gastone Penzo
 *www.solr-italia.it*
 *The first italian blog about Apache Solr*


Re: invert terms in search with exact match

2011-03-24 Thread Gastone Penzo
No, because I don't know the words I want to ignore, and I don't want to use
dismax.
I have to use the standard handler.

The problem is very simple: I want to receive only documents that have in the
title field ONLY the words I search,
in any order.

If I search "my love darling", I want Solr to return these possible titles:

title1: my love darling
title2: my darling love
title3: darling my love
title4: love my darling
.

all the combinations of these 3 words. Other words have to be ignored.

thanx


2011/3/24 Bill Bell billnb...@gmail.com

 Yes create qt with dismax and qf on field that has query stopwords for the
 words you want to ignore.

 Bill Bell
 Sent from mobile


 On Mar 24, 2011, at 7:58 AM, Gastone Penzo gastone.pe...@gmail.com
 wrote:

  Hi,
  is it possible with standard query search (not dismax) to have
  exact matches that allow any terms order?
 
  for example:
 
  if i search my love i would solr gives to me docs with
 
  - my love
  - love my
 
  it's easy: q=title:(my AND love)
 
  the problem is it returns also docs with
 
  my love is my dog
 
  i don't want this. i want only docs with title formed by these 2 terms:
 my
  and love.
 
  is it possible??
 
  thanx
 
  --
  Gastone Penzo
  *www.solr-italia.it*
  *The first italian blog about Apache Solr*




-- 
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*


Re: dismax parser, parens, what do they do exactly

2011-03-24 Thread Jonathan Rochkind
Thanks Hoss, this is very helpful. Okay, dismax is not intended to do 
anything with parens for semantics; they're just like any other char, 
handled by analyzers.


I think you're right I cut and paste the wrong query before. Just for 
the record, on 1.4.1:


qf=text
pf=
q=book (dog +(cat -frog))

<str name="parsedquery">
+((DisjunctionMaxQuery((text:book)~0.01) 
DisjunctionMaxQuery((text:dog)~0.01) 
DisjunctionMaxQuery((text:cat)~0.01) 
-DisjunctionMaxQuery((text:frog)~0.01))~3) ()
</str>

<str name="parsedquery_toString">
+(((text:book)~0.01 (text:dog)~0.01 (text:cat)~0.01 -(text:frog)~0.01)~3) ()
</str>




Re: invert terms in search with exact match

2011-03-24 Thread Jonathan Rochkind
You can use query slop, as others have said, to find documents with "my" 
and "love" right next to each other, in any order. And I think query 
slop can probably work for three or more words too.


But it won't find documents with ONLY those words in them. For instance "my 
love"~2 will still match:


love my something else
something my love else
other love my

etc.

Solr isn't so good at doing exact matches in general, although there 
are some techniques to set up your index and queries to do actual 
exact (entire-field) matches -- mostly putting fake tokens like 
$BEGIN and $END at the beginning and end of your indexed values, and 
then doing a phrase search which puts those tokens at begin and end too.


But I'm not sure if you can extend that technique to find exactly the 
words in _any_ order, instead of just the exact phrase. Maybe 
somehow using phrase slop? It gets confusing to think about; I'm not sure.
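
For what it's worth, the requirement itself (a title consisting of exactly the query's words, in any order) is plain token-multiset equality, which is easy to verify as a post-filter outside Solr — a toy sketch, not a Solr query:

```python
from collections import Counter

def tokenize(s):
    # stand-in for the analysis chain: lowercase + whitespace split
    return s.lower().split()

def exact_any_order(query, title):
    """True only when the title consists of exactly the query's words,
    in any order (multiset equality, so repeated words count too)."""
    return Counter(tokenize(query)) == Counter(tokenize(title))

print(exact_any_order("my love darling", "darling my love"))  # True
print(exact_any_order("my love", "my love is my dog"))        # False
```

In practice one could over-fetch with q=title:(my AND love) and drop the non-exact hits client-side, at the cost of retrieving extra documents.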


On 3/24/2011 10:52 AM, Gastone Penzo wrote:

no beacuse i don't know the words i want to ignore.. and i don't want use
dismax.
i have to use standard handler.

the problem is very simple. i want to recive only documents that have in
title field ONLY the words i search,
in any order.

if i search my love darling, i want solr returns me these possilbe titles:

title1: my love darling
title2: my darling love
title3: darling my love
title4: love my darling
.

all the combinations of these 3 words. others words have to be ignored

thanx


2011/3/24 Bill Bellbillnb...@gmail.com


Yes create qt with dismax and qf on field that has query stopwords for the
words you want to ignore.

Bill Bell
Sent from mobile


On Mar 24, 2011, at 7:58 AM, Gastone Penzogastone.pe...@gmail.com
wrote:


Hi,
is it possible with standard query search (not dismax) to have
exact matches that allow any terms order?

for example:

if i search my love i would solr gives to me docs with

- my love
- love my

it's easy: q=title:(my AND love)

the problem is it returns also docs with

my love is my dog

i don't want this. i want only docs with title formed by these 2 terms:

my

and love.

is it possible??

thanx

--
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*





Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Brandon Waterloo
Hello everyone,

I've been trying for several hours now to set up Solr with multiple cores with 
Solr Cell working on each core. The only items being indexed are PDF, DOC, and 
TXT files (with the possibility of expanding this list, but for now, just 
assume the only things in the index should be documents).

I never had any problems with Solr Cell when I was using a single core. In 
fact, I just ran the default installation in example/ and worked from that. 
However, trying to migrate to multi-core has been a never ending list of 
problems.

Any time I try to add a document to the index (using the same curl command as I 
did to add to the single core, of course adding the core name to the request 
URL-- host/solr/corename/update/extract...), I get HTTP 500 errors due to 
classes not being found and/or lazy loading errors. I've copied the exact 
example/lib directory into the cores, and that doesn't work either.

Frankly the only libraries I want are those relevant to indexing files. The 
less bloat, the better, after all. However, I cannot figure out where to put 
what files, and why the example installation works perfectly for single-core 
but not with multi-cores.

Here is an example of the errors I'm receiving:

command prompt curl 
host/solr/core0/update/extract?literal.id=2-3-1&commit=true -F 
myfile=@test2.txt

HTTP ERROR: 500
org/apache/tika/exception/TikaException

java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.ClassNotFoundException: 
org.apache.tika.exception.TikaException
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 27 more
/pre
RequestURI=/solr/core0/update/extract (Powered by Jetty)

Any assistance you could provide or installation guides/tutorials/etc. that you 
could link me to would be greatly appreciated. Thank you all for your time!

~Brandon Waterloo



Re: Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Markus Jelsma
Sounds like the Tika jar is not on the class path. Add it to a directory where 
Solr's looking for libs.
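
If it helps, each core's solrconfig.xml can pull in the extraction jars with <lib> directives so you don't have to copy example/lib around; a sketch only — the paths below are assumptions and depend on where the Tika/Solr Cell jars live relative to the core's instanceDir:

```xml
<!-- near the top of each core's solrconfig.xml; adjust paths to your layout -->
<lib dir="../../contrib/extraction/lib" />
<lib dir="../../dist" />
```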

On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote:
 Hello everyone,
 
 I've been trying for several hours now to set up Solr with multiple cores
 with Solr Cell working on each core. The only items being indexed are PDF,
 DOC, and TXT files (with the possibility of expanding this list, but for
 now, just assume the only things in the index should be documents).
 
 I never had any problems with Solr Cell when I was using a single core. In
 fact, I just ran the default installation in example/ and worked from
 that. However, trying to migrate to multi-core has been a never ending
 list of problems.
 
 Any time I try to add a document to the index (using the same curl command
 as I did to add to the single core, of course adding the core name to the
 request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors
 due to classes not being found and/or lazy loading errors. I've copied the
 exact example/lib directory into the cores, and that doesn't work either.
 
 Frankly the only libraries I want are those relevant to indexing files. The
 less bloat, the better, after all. However, I cannot figure out where to
 put what files, and why the example installation works perfectly for
 single-core but not with multi-cores.
 
 Here is an example of the errors I'm receiving:
 
 command prompt curl
 host/solr/core0/update/extract?literal.id=2-3-1&commit=true -F
 myfile=@test2.txt
 
 HTTP ERROR: 500
 org/apache/tika/exception/TikaException
 
 java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:247)
 at
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:
 359) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413) at
 org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449) at
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedH
 andler(RequestHandlers.java:240) at
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleReque
 st(RequestHandlers.java:231) at
 org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java
 :338) at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.jav
 a:241) at
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandl
 er.java:1089) at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
 at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216
 ) at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
 at
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCo
 llection.java:211) at
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:
 114) at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
 at org.mortbay.jetty.Server.handle(Server.java:285)
 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
 at
 org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.jav
 a:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
 at
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:
 226) at
 org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java
 :442) Caused by: java.lang.ClassNotFoundException:
 org.apache.tika.exception.TikaException at
 java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
 ... 27 more
 RequestURI=/solr/core0/update/extract (Powered by Jetty)
 
 Any assistance you could provide or installation guides/tutorials/etc. that
 you could link me to would be greatly appreciated. Thank you all for your
 time!
 
 ~Brandon Waterloo

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Re: invert terms in search with exact match

2011-03-24 Thread Dario Rigolin
On Thursday, March 24, 2011 03:52:31 pm Gastone Penzo wrote:

 
 title1: my love darling
 title2: my darling love
 title3: darling my love
 title4: love my darling

Sorry but simply search for:


 title:( my OR love OR darling) 
If your default operator is OR you don't need to put OR in the query.

Best regards.
Dario Rigolin
Comperio srl (Italy)


Re: invert terms in search with exact match

2011-03-24 Thread Gastone Penzo
Yes, sorry, I made a mistake:

title:(my AND love AND darling)

All three words have to match. The problem is still that I don't want results
with other words.


2011/3/24 Dario Rigolin dario.rigo...@comperio.it

 On Thursday, March 24, 2011 03:52:31 pm Gastone Penzo wrote:

 
  title1: my love darling
  title2: my darling love
  title3: darling my love
  title4: love my darling

 Sorry but simply search for:


  title:( my OR love OR darling)
 If you have default operator OR you don't need to put OR on the query

 Best regards.
 Dario Rigolin
 Comperio srl (Italy)




-- 
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*


Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-24 Thread fr . jurain
Hello Solrists,
 
As it says in the subject line, I'm looking for a Java component that,
given an ISO 639-1 code or some equivalent,
would return a Lucene Analyzer ready to gobble documents in the corresponding 
language.
Solr looks like it must contain one,
only I've not been able to locate it so far;
can you point me to the spot?
 
I've found org.apache.solr.analysis,
and things like org.apache.lucene.analysis.bg etc. in lucene/modules,
with many classes which I'm sure are related; however, the factory itself still
eludes me.
I mean the Java class.method that would decide, on request, what to do with all
these packages
to bring the requisite object into existence once the language is specified.
Where should I look? Or was I mistaken, and Solr has nothing of the kind, at least
in Java?
Thanks in advance for your help.
 
Best regards,
François Jurain.



  Retrouvez les 10 conseils pour économiser votre carburant sur Voila :  
http://actu.voila.fr/evenementiel/LeDossierEcologie/l-eco-conduite/





Solr throwing exception when evicting from filterCache

2011-03-24 Thread Matt Mitchell
I have a recent build of solr (4.0.0.2011.02.25.13.06.24). I am seeing this
error when making a request (with fq's), right at the point where the
eviction count goes from 0 up:

SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
[Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;

If you then make another request, Solr responds with the expected result.

Is this a bug? Has anyone seen this before? Any tips/help/feedback/questions
would be much appreciated!

Thanks,
Matt


Re: Detecting an empty index during start-up

2011-03-24 Thread Chris Hostetter
: I am not familiar with Solr internals, so the approach I wanted to take was
: to basically check the numDocs property of the index during start-up and set
: a READABLE state in the ZooKeeper node if it's greater than 0. I also
: planned to create a commit hook for replication and updating which
: controlled the READABLE property based on numDocs also.
: 
: This just leaves the problem of finding out the number of documents during
: start-up. I planned to have something like:

Most of the ZK stuff you mentioned is over my head, but I get the general 
gist of what you want:

 * a hook on startup that checks numDocs
 * if not empty, trigger some logic

My suggestion would be to implement this as a firstSearcher 
SolrEventListener.  When that runs, you'll have easy access to a 
SolrIndexSearcher (and you won't even have to refcount it), and you can 
fire whatever logic you want based on what you find when looking at it.
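
A minimal sketch of how such a hook could be wired into solrconfig.xml.
The class name here is hypothetical; the assumption is a custom class
implementing org.apache.solr.core.SolrEventListener, whose newSearcher()
callback receives the SolrIndexSearcher and can inspect numDocs there:

```xml
<!-- inside the <query> section of solrconfig.xml -->
<!-- com.example.IndexStateListener is a hypothetical class implementing
     SolrEventListener; for firstSearcher events its newSearcher() callback
     is invoked with the new searcher, where numDocs can be checked -->
<listener event="firstSearcher" class="com.example.IndexStateListener"/>
```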


-Hoss


Re: how to run boost query for non-dismax query parser

2011-03-24 Thread cyang2010
Hi iorixxx,

Thanks for your reply.  yeah, an additional query with the boost value will
work.

However, I just wonder where you get the information that BoostQParserPlugin
only handles function query?

I looked up the javadoc, and still can't get that.  This is the javadoc.


Create a boosted query from the input value. The main value is the query to
be boosted.
Other parameters: b, the function query to use as the boost. 


This just says that if a b value is specified, it is a function query. I
just don't understand why the dismax parser has both bf and bq, but for
BoostQParserPlugin there is only a bf equivalent.

Another question: by specifying local params like that in a query, does it
mean the default LuceneQParserPlugin is used for the main query, and
BoostQParserPlugin only for the content inside the {}?

Thanks; I look forward to your reply.


cy

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-run-boost-query-for-non-dismax-query-parser-tp2723442p2726422.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr throwing exception when evicting from filterCache

2011-03-24 Thread Matt Mitchell
Here's the full stack trace:

[Ljava.lang.Object; cannot be cast to
[Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
[Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry; at
org.apache.solr.common.util.ConcurrentLRUCache$PQueue.myInsertWithOverflow(ConcurrentLRUCache.java:377)
at
org.apache.solr.common.util.ConcurrentLRUCache.markAndSweep(ConcurrentLRUCache.java:329)
at
org.apache.solr.common.util.ConcurrentLRUCache.put(ConcurrentLRUCache.java:144)
at org.apache.solr.search.FastLRUCache.put(FastLRUCache.java:131) at
org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:613)
at
org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:652)
at
org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1233)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1086)
at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:337)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:431)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:231)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1298) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:340)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at
org.mortbay.jetty.handler.ContextHandlerCollection.h

On Thu, Mar 24, 2011 at 1:54 PM, Matt Mitchell goodie...@gmail.com wrote:

 I have a recent build of solr (4.0.0.2011.02.25.13.06.24). I am seeing this
 error when making a request (with fq's), right at the point where the
 eviction count goes from 0 up:

 SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
 [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;

 If you then make another request, Solr responds with the expected result.

 Is this a bug? Has anyone seen this before? Any
 tips/help/feedback/questions would be much appreciated!

 Thanks,
 Matt



Re: Solr throwing exception when evicting from filterCache

2011-03-24 Thread Yonik Seeley
On Thu, Mar 24, 2011 at 1:54 PM, Matt Mitchell goodie...@gmail.com wrote:
 I have a recent build of solr (4.0.0.2011.02.25.13.06.24). I am seeing this
 error when making a request (with fq's), right at the point where the
 eviction count goes from 0 up:

Yep, this was a bug that has since been fixed.

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


Re: how to run boost query for non-dismax query parser

2011-03-24 Thread Ahmet Arslan
 Thanks for your reply.  yeah, an additional query with
 the boost value will
 work.
 
 However, I just wonder where you get the information that
 BoostQParserPlugin
 only handles function query?
 
 I looked up the javadoc, and still can't get that. 
 This is the javadoc.
 
 
 Create a boosted query from the input value. The main value
 is the query to
 be boosted.
 Other parameters: b, the function query to use as the
 boost. 
 
 
 This just say if b value is specified it is a function
 query.   

As you and wiki said, b is the function query to use as the boost.

 I just don't
 understand why dismaxParser has both bf and bq, but for
 BoostQParserPlugin
 there is only bf equivalent. 

I don't know that :) However, optional clauses with LuceneQParserPlugin 
will have the same effect as dismax's bq.
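
For illustration (the field name and boost value below are made up, not
from this thread), a dismax bq such as bq=genre:comedy^2 can be mimicked
with the lucene parser by making the main clause required and appending an
optional boosted clause:

```
q=+(title:darling) genre:comedy^2
```

The + marks the main query as mandatory, so the optional clause never
changes which documents match; it only contributes to their score.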
 
 Another question is by specifying localParameter like that
 in query, does it
 mean to use the default LuceneQParserPlugin primarily and
 only use
 BoostQParserPlugin for the content with the {}?

Not only for BoostQParserPlugin.

http://wiki.apache.org/solr/LocalParams

http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams






Fuzzy query using dismax query parser

2011-03-24 Thread cyang2010
Hi,

I wonder how to conduct a fuzzy query using the dismax query parser? I am
able to do a prefix query with local params and prefixQueryParser, but how
do I handle a fuzzy query?

I like the behavior of dismax, except that it does not support prefix
queries or fuzzy queries.

Thanks.

cy

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Fuzzy-query-using-dismax-query-parser-tp2727075p2727075.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: invert terms in search with exact match

2011-03-24 Thread Ahmet Arslan

Then you need to write some custom code for that. Lucene in Action (second 
edition, section 6.3.4) has an example of translating a PhraseQuery to a 
SpanNearQuery.

Just use false for the third parameter in SpanNearQuery's constructor.


You can plug https://issues.apache.org/jira/browse/SOLR-1604 too.
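
A rough sketch of that SpanNearQuery approach, assuming the title field
from the earlier examples (this is a fragment, not a runnable program; the
Lucene jars must be on the classpath):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

SpanQuery[] clauses = new SpanQuery[] {
    new SpanTermQuery(new Term("title", "my")),
    new SpanTermQuery(new Term("title", "love")),
    new SpanTermQuery(new Term("title", "darling"))
};
// slop 0 keeps the terms adjacent; inOrder=false (the third constructor
// parameter) lets them appear in any order, so all four permutations of
// "my love darling" can match
SpanNearQuery query = new SpanNearQuery(clauses, 0, false);
```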


 yes sorry i made  a mistake
 
 title(my AND love AND darling)
 
 all three words have to match. the problem is always i
 don't want results
 with other words.
 
 
 2011/3/24 Dario Rigolin dario.rigo...@comperio.it
 
  On Thursday, March 24, 2011 03:52:31 pm Gastone Penzo
 wrote:
 
  
   title1: my love darling
   title2: my darling love
   title3: darling my love
   title4: love my darling
 
  Sorry but simply search for:
 
 
   title:( my OR love OR darling)
  If you have default operator OR you don't need to put
 OR on the query
 
  Best regards.
  Dario Rigolin
  Comperio srl (Italy)
 
 
 
 
 -- 
 Gastone Penzo
 *www.solr-italia.it*
 *The first italian blog about Apache Solr*
 





Newbie wants to index XML content.

2011-03-24 Thread Marcelo Iturbe
Hello,
I've been reading up on how to index XML content but have a few questions.

How is data in element attributes handled or defined? How are nested
elements handled?

In the following XML structure, I want to index the content between the
entry tags. In one XML document there can be up to 100 entry tags, so the
entry tag would be equivalent to Solr's doc tag.

Can I somehow index this XML as-is, or will I have to parse it, creating
the doc tags and placing all the elements at the same level?

Thanks for your help.

<?xml version="1.0" encoding="utf-8"?>
<root>
<source>manual</source>
<author>
<name>MC Anon User</name>
<email>mca...@mcdomain.com</email>
</author>

<entry>
<name>
<fullname>John Smith</fullname>
</name>
<email>jsmit...@gmail.com</email>
</entry>

<entry>
<name>
<fullname>First Last</fullname>
<firstname>First</firstname>
<lastname>Last</lastname>
</name>
<organization>
<name>MC S.A.</name>
<tittle>CIO</tittle>
</organization>
<email type="work" primary="true">fi...@mcdomain.com</email>
<email>flas...@yahoo.com</email>
<phoneNumber type="work" primary="true">+5629460600</phoneNumber>
<im carrier="gtalk" primary="true">fi...@mcdomain.com</im>
<im carrier="skype">First.Last</im>
<postalAddress>111 Bude St, Toronto</postalAddress>
<custom name="blog">http://blog.mcdomain.com/</custom>
</entry>
</root>
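
For reference, and using hypothetical field names, Solr's update XML
expects a flat doc structure, so after parsing, each entry would typically
be flattened into something like this (one doc per entry):

```xml
<add>
  <doc>
    <field name="id">entry-1</field>
    <field name="fullname">John Smith</field>
    <field name="email">jsmit...@gmail.com</field>
  </doc>
</add>
```

Attributes such as type="work" would likewise have to be mapped into
fields of their own (e.g. a work_email field), since plain Solr update
XML has no notion of nested elements or attributes.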

regards
Marcelo


RE: Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Brandon Waterloo
Well, there lies the problem: it's not JUST the Tika jar.  If it's not one 
thing, it's another, and I'm not even sure which directory Solr actually looks 
in.  In my solr.xml file I have it use a shared library folder for every core.  
Since each core will be holding very homogeneous data, there's no need to have 
any different library modules for each.

The relevant line in my solr.xml file is <solr persistent="true" 
sharedLib="lib">.  That is housed in .../example/solr/.  So, does it look in 
.../example/lib or .../example/solr/lib?

~Brandon Waterloo

From: Markus Jelsma [markus.jel...@openindex.io]
Sent: Thursday, March 24, 2011 11:29 AM
To: solr-user@lucene.apache.org
Cc: Brandon Waterloo
Subject: Re: Multiple Cores with Solr Cell for indexing documents

Sounds like the Tika jar is not on the class path. Add it to a directory where
Solr's looking for libs.

On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote:
 Hello everyone,

 I've been trying for several hours now to set up Solr with multiple cores
 with Solr Cell working on each core. The only items being indexed are PDF,
 DOC, and TXT files (with the possibility of expanding this list, but for
 now, just assume the only things in the index should be documents).

 I never had any problems with Solr Cell when I was using a single core. In
 fact, I just ran the default installation in example/ and worked from
 that. However, trying to migrate to multi-core has been a never ending
 list of problems.

 Any time I try to add a document to the index (using the same curl command
 as I did to add to the single core, of course adding the core name to the
 request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors
 due to classes not being found and/or lazy loading errors. I've copied the
 exact example/lib directory into the cores, and that doesn't work either.

 Frankly the only libraries I want are those relevant to indexing files. The
 less bloat, the better, after all. However, I cannot figure out where to
 put what files, and why the example installation works perfectly for
 single-core but not with multi-cores.

 Here is an example of the errors I'm receiving:

 command prompt curl
 host/solr/core0/update/extract?literal.id=2-3-1commit=true -F
 myfile=@test2.txt

 html
 head
 meta http-equiv=Content-Type content=text/html; charset=ISO-8859-1/
 titleError 500 /title
 /head
 bodyh2HTTP ERROR: 500/h2preorg/apache/tika/exception/TikaException

 java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:247)
 at
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:
 359) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413) at
 org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449) at
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedH
 andler(RequestHandlers.java:240) at
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleReque
 st(RequestHandlers.java:231) at
 org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java
 :338) at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.jav
 a:241) at
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandl
 er.java:1089) at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
 at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216
 ) at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
 at
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCo
 llection.java:211) at
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:
 114) at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
 at org.mortbay.jetty.Server.handle(Server.java:285)
 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
 at
 org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.jav
 a:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
 at
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:
 226) at
 org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java
 :442) Caused by: java.lang.ClassNotFoundException:
 org.apache.tika.exception.TikaException at
 java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at 

Re: Fuzzy query using dismax query parser

2011-03-24 Thread Ahmet Arslan
 I wonder how to conduct fuzzy query using dismax query
 parser?  I am able to
 do prefix query with local params and
 prefixQueryParser.  But how to handle
 fuzzy query?  
 
 I like the behavior of dismax except it does not support
 the prefix query
 and fuzzy query.

You may interested in https://issues.apache.org/jira/browse/SOLR-1553
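
Until then, note that the standard lucene query parser already accepts
fuzzy syntax directly (the field name and similarity threshold below are
illustrative), so one workaround is to issue the fuzzy clause through it:

```
q=title:darling~0.7
```

The ~0.7 is the minimum similarity; omitting the number (title:darling~)
uses the parser's default threshold.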





Re: how to run boost query for non-dismax query parser

2011-03-24 Thread cyang2010
iorixxx, thanks for your reply.

Another, slightly off-topic question: I looked over all the subclasses of
QParserPlugin. It seems like most of them provide parsing complementary to
the default lucene/solr parser, except prefixParser. What is the intended
usage of that one? The default lucene/solr parser is already able to parse
prefix queries. Is the intended usage with the dismax parser?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-run-boost-query-for-non-dismax-query-parser-tp2723442p2727566.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Fuzzy query using dismax query parser

2011-03-24 Thread cyang2010
OK, I will have to wait for the Solr 3 release then.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Fuzzy-query-using-dismax-query-parser-tp2727075p2727572.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Markus Jelsma
I believe it's example/solr/lib where it looks for shared libs in multicore. 
But each core can have its own lib dir, usually in core/lib. This is 
referenced in solrconfig.xml; see the example config for the lib directive.
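
As an illustration, the lib directives from the example solrconfig.xml
look like this; paths are resolved relative to the core's instanceDir, so
each core can pull in its own jars independently of the sharedLib folder:

```xml
<!-- per-core jars, e.g. .../example/solr/core0/lib -->
<lib dir="./lib" />
<!-- jars needed by Solr Cell / the extracting request handler -->
<lib dir="../../contrib/extraction/lib" />
```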

 Well, there lies the problem--it's not JUST the Tika jar.  If it's not one
 thing, it's another, and I'm not even sure which directory Solr actually
 looks in.  In my Solr.xml file I have it use a shared library folder for
 every core.  Since each core will be holding very homologous data, there's
 no need to have any different library modules for each.
 
 The relevant line in my solr.xml file is solr persistent=true
 sharedLib=lib.  That is housed in .../example/solr/.  So, does it look
 in .../example/lib or .../example/solr/lib?
 
 ~Brandon Waterloo
 
 From: Markus Jelsma [markus.jel...@openindex.io]
 Sent: Thursday, March 24, 2011 11:29 AM
 To: solr-user@lucene.apache.org
 Cc: Brandon Waterloo
 Subject: Re: Multiple Cores with Solr Cell for indexing documents
 
 Sounds like the Tika jar is not on the class path. Add it to a directory
 where Solr's looking for libs.
 
 On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote:
  Hello everyone,
  
  I've been trying for several hours now to set up Solr with multiple cores
  with Solr Cell working on each core. The only items being indexed are
  PDF, DOC, and TXT files (with the possibility of expanding this list,
  but for now, just assume the only things in the index should be
  documents).
  
  I never had any problems with Solr Cell when I was using a single core.
  In fact, I just ran the default installation in example/ and worked from
  that. However, trying to migrate to multi-core has been a never ending
  list of problems.
  
  Any time I try to add a document to the index (using the same curl
  command as I did to add to the single core, of course adding the core
  name to the request URL-- host/solr/corename/update/extract...), I get
  HTTP 500 errors due to classes not being found and/or lazy loading
  errors. I've copied the exact example/lib directory into the cores, and
  that doesn't work either.
  
  Frankly the only libraries I want are those relevant to indexing files.
  The less bloat, the better, after all. However, I cannot figure out
  where to put what files, and why the example installation works
  perfectly for single-core but not with multi-cores.
  
  Here is an example of the errors I'm receiving:
  
  command prompt curl
  host/solr/core0/update/extract?literal.id=2-3-1commit=true -F
  myfile=@test2.txt
  
  html
  head
  meta http-equiv=Content-Type content=text/html; charset=ISO-8859-1/
  titleError 500 /title
  /head
  bodyh2HTTP ERROR:
  500/h2preorg/apache/tika/exception/TikaException
  
  java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:247)
  at
  org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java
  : 359) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
  at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
  at
  org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappe
  dH andler(RequestHandlers.java:240) at
  org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequ
  e st(RequestHandlers.java:231) at
  org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
  at
  org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.jav
  a
  
  :338) at
  
  org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.ja
  v a:241) at
  org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHand
  l er.java:1089) at
  org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
  at
  org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:21
  6 ) at
  org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
  at
  org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
  at
  org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerC
  o llection.java:211) at
  org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java
  : 114) at
  org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
  at org.mortbay.jetty.Server.handle(Server.java:285)
  at
  org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
  at
  org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.ja
  v a:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
  at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202) at
  org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) at
  org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java
  : 226) at
  

Re: solr on the cloud

2011-03-24 Thread Otis Gospodnetic
Hi,


 I have tried running the sharded Solr with ZooKeeper on a single machine.

 The Solr code is from current trunk. It runs nicely. Can you please point
 me to a page where I can check the status of the Solr Cloud development
 and available features, apart from http://wiki.apache.org/solr/SolrCloud ?

I'm afraid that's the most comprehensive documentation so far.

 Basically, of high interest is checking out the Map-Reduce for distributed
 faceting; is it even possible with the trunk?

Hm, MR for distributed faceting?  Maybe I missed this... can you point to a 
place that mentions this?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


Re: [ANNOUNCEMENT] solr-packager 1.0.2 released!

2011-03-24 Thread Otis Gospodnetic
Hi Simone,

This is handy!
Any chance you'll be adding a version with Jetty 7.* ?

Thanks,
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Simone Tripodi simonetrip...@apache.org
 To: solr-user@lucene.apache.org
 Sent: Sat, March 19, 2011 8:13:36 PM
 Subject: [ANNOUNCEMENT] solr-packager 1.0.2 released!
 
 Hi all,
 The Sourcesense Solr Packager team is pleased to announce the
 solr-packager-site-1.0.2 release!
 
 Solr-Packager is a Maven archetype to package standalone Apache Solr
 embedded in Tomcat, brought to you by Sourcesense.
 
 Changes in this version include:
 
 Fixed bugs:
 o Custom context root. Issue: 4.
 o Slave classifier doesn't get installed in M2 local repo. Issue: 5.
 
 More information at http://sourcesense.github.com/solr-packager/
 
 Have fun!
 - Simone Tripodi, on behalf of Sourcesense
 
 http://people.apache.org/~simonetripodi/
 http://www.99soft.org/
 


stopwords not working in multicore setup

2011-03-24 Thread Christopher Bottaro
Hello,

I'm running a Solr server with 5 cores.  Three are for English content and
two are for German content.  The default stopwords setup works fine for the
English cores, but the German stopwords aren't working.

The German stopwords file is stopwords-de.txt and resides in the same
directory as stopwords.txt.  The German cores use a different schema (named
schema.page.de.xml) which has the following text field definition:
http://pastie.org/1711866

The stopwords-de.txt file looks like this:  http://pastie.org/1711869

The query I'm doing is this:  q = title:für

And it's returning documents with für in the title.  Title is a text field
which should use the stopwords-de.txt, as seen in the aforementioned pastie.

Any ideas?  Thanks for the help.
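
One common cause of this symptom is the stopword filter being present in
only one of the two analyzer chains (index vs. query). A sketch of a
field type where stopwords-de.txt is applied on both sides; the type name
and tokenizer here are illustrative, not taken from the pastie:

```xml
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- same stopword file on both sides, or queries like title:für
         will still reach the index -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords-de.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords-de.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

If documents already in the index were added before the stopword file was
in place, they also need to be re-indexed for the change to take effect.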