Re: Multiple Cores with Solr Cell for indexing documents

2011-03-25 Thread Upayavira
There are options in solr.xml that point to lib dirs. Make sure you get
them right.

Upayavira

On Thu, 24 Mar 2011 23:28 +0100, Markus Jelsma
markus.jel...@openindex.io wrote:
 I believe it's example/solr/lib where it looks for shared libs in
 multicore.
 But each core can have its own lib dir, usually in core/lib. This is
 referenced in solrconfig.xml; see the example config for the lib
 directive.
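
To see which directories a given setup actually points at, one could read both files and print the result. This is only a rough sketch: it assumes, as suggested in the reply above, that a relative sharedLib is resolved against the directory holding solr.xml (the Solr home). The function names and the sample file contents are made up for illustration; the lib path in the sample is just a placeholder.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def resolve_shared_lib(solr_xml_path):
    """Return the absolute sharedLib directory, or None if the attribute is absent.

    Assumption: a relative sharedLib is resolved against the directory that
    contains solr.xml (the Solr home), as suggested in the thread.
    """
    root = ET.parse(solr_xml_path).getroot()  # the <solr> element
    shared = root.get("sharedLib")
    if shared is None:
        return None
    solr_home = os.path.dirname(os.path.abspath(solr_xml_path))
    return os.path.normpath(os.path.join(solr_home, shared))


def core_lib_dirs(solrconfig_path):
    """Return the dir attribute of every <lib> directive in a solrconfig.xml."""
    root = ET.parse(solrconfig_path).getroot()
    return [lib.get("dir") for lib in root.iter("lib") if lib.get("dir")]


# Hypothetical minimal files, written to a temp dir purely for illustration.
demo_home = tempfile.mkdtemp()
solr_xml = os.path.join(demo_home, "solr.xml")
with open(solr_xml, "w") as fh:
    fh.write('<solr persistent="true" sharedLib="lib">'
             '<cores adminPath="/admin/cores">'
             '<core name="core0" instanceDir="core0"/>'
             '</cores></solr>')
solrconfig = os.path.join(demo_home, "solrconfig.xml")
with open(solrconfig, "w") as fh:
    fh.write('<config><lib dir="../../contrib/extraction/lib"/></config>')

print(resolve_shared_lib(solr_xml))   # <temp dir>/lib
print(core_lib_dirs(solrconfig))      # the per-core lib directive paths
```

Pointing the two functions at the real solr.xml and each core's solrconfig.xml should show quickly whether the jars were copied to the place the configuration refers to.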
 

RE: Multiple Cores with Solr Cell for indexing documents

2011-03-25 Thread Brandon Waterloo
I did finally manage to deploy Solr with multiple cores but we've been running 
into so many problems with permissions, index location, and other things that I 
(quite fortunately) convinced my boss that multiple cores are not the way to go 
here.  I had in place a single-core system that would filter the results based 
on their ID numbers, and show only the subset of results that you wanted to 
see.  The disadvantage is that it's a single core and thus will take longer to 
search over the entire index.  The advantage is that it's better in every other 
way.

So the plan now is to move back to single-core searching and then test it with 
a huge number of documents to see whether performance is seriously impacted or 
not.  So for now, I guess we can consider this thread resolved.

Thanks for all your help guys!

~Brandon Waterloo
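
For readers wondering what such single-core filtering could look like, here is a minimal sketch using Solr's fq (filter query) parameter. It is not necessarily how the setup described above works: the field name "id", the base URL, and the helper name are all assumptions for illustration.

```python
from urllib.parse import urlencode


def select_url(base, query, ids, id_field="id"):
    """Build a /select URL that restricts results to the given IDs via fq.

    The field name and base URL are hypothetical; IDs containing spaces or
    query-syntax characters would need escaping, which is not handled here.
    """
    fq = "{}:({})".format(id_field, " OR ".join(str(i) for i in ids))
    return base.rstrip("/") + "/select?" + urlencode([("q", query), ("fq", fq)])


# Hypothetical host and IDs, purely for illustration.
print(select_url("http://localhost:8983/solr", "budget report", [101, 102, 105]))
```

Filter queries are cached separately from the main query, so repeating the same ID subset across searches should be comparatively cheap.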



From: Markus Jelsma [markus.jel...@openindex.io]
Sent: Friday, March 25, 2011 1:23 PM
To: solr-user@lucene.apache.org
Cc: Upayavira
Subject: Re: Multiple Cores with Solr Cell for indexing documents

Lib dirs for an individual core can only be configured in solrconfig.xml.
In solr.xml you can use sharedLib, though.


Re: Multiple Cores with Solr Cell for indexing documents

2011-03-25 Thread Erick Erickson
Right, and you can go to sharding rather than managing your multiple
cores if that's warranted.

Erick


Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Brandon Waterloo
Hello everyone,

I've been trying for several hours now to set up Solr with multiple cores with 
Solr Cell working on each core. The only items being indexed are PDF, DOC, and 
TXT files (with the possibility of expanding this list, but for now, just 
assume the only things in the index should be documents).

I never had any problems with Solr Cell when I was using a single core. In 
fact, I just ran the default installation in example/ and worked from that. 
However, trying to migrate to multi-core has been a never-ending list of 
problems.

Any time I try to add a document to the index (using the same curl command as I 
did to add to the single core, of course adding the core name to the request 
URL-- host/solr/corename/update/extract...), I get HTTP 500 errors due to 
classes not being found and/or lazy loading errors. I've copied the exact 
example/lib directory into the cores, and that doesn't work either.

Frankly the only libraries I want are those relevant to indexing files. The 
less bloat, the better, after all. However, I cannot figure out where to put 
what files, and why the example installation works perfectly for single-core 
but not with multi-cores.

Here is an example of the errors I'm receiving:

command prompt> curl "host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F "myfile=@test2.txt"

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body><h2>HTTP ERROR: 500</h2><pre>org/apache/tika/exception/TikaException

java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
    at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    ... 27 more
</pre>
<p>RequestURI=/solr/core0/update/extract</p><p><i><small><a href="http://jetty.mortbay.org/">Powered by Jetty://</a></small></i></p><br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>

</body>
</html>

Any assistance you could provide or installation guides/tutorials/etc. that you 
could link me to would be greatly appreciated. Thank you all for your time!

~Brandon Waterloo
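
For reference, the per-core extract URL from the curl command above can also be assembled programmatically, which keeps the query parameters correctly separated by "&". A small sketch only: the base URL (localhost:8983) and the helper name are assumptions, since the message elides the real host.

```python
from urllib.parse import urlencode


def extract_url(base, core, params):
    """Build the /update/extract URL for one core from a list of (key, value) pairs."""
    return "{}/{}/update/extract?{}".format(base.rstrip("/"), core, urlencode(params))


# Hypothetical host; the core name and parameters mirror the command above.
url = extract_url(
    "http://localhost:8983/solr",
    "core0",
    [("literal.id", "2-3-1"), ("commit", "true")],
)
print(url)
```

The resulting string can be passed to curl in quotes, together with the same -F file argument as before.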



Re: Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Markus Jelsma
Sounds like the Tika jar is not on the class path. Add it to a directory where 
Solr's looking for libs.
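
One way to check this is to scan a lib directory for a jar that actually contains the class named in the stack trace. The following is a minimal sketch, not part of Solr; the function name is made up, and the directory in the usage part is only a guess at where the shared libs live.

```python
import os
import zipfile


def jars_providing(class_name, lib_dir):
    """Return the names of jars in lib_dir that contain the given class."""
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for name in sorted(os.listdir(lib_dir)):
        if not name.endswith(".jar"):
            continue
        try:
            with zipfile.ZipFile(os.path.join(lib_dir, name)) as jar:
                if entry in jar.namelist():
                    hits.append(name)
        except zipfile.BadZipFile:
            # Skip files that are named *.jar but are not valid archives.
            continue
    return hits


# Hypothetical location; adjust to wherever sharedLib or <lib dir=.../> points.
candidate = os.path.join("example", "solr", "lib")
if os.path.isdir(candidate):
    print(jars_providing("org.apache.tika.exception.TikaException", candidate))
```

An empty list for a directory Solr is configured to load would be consistent with the NoClassDefFoundError in the original message.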


-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


RE: Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Brandon Waterloo
Well, there lies the problem--it's not JUST the Tika jar.  If it's not one 
thing, it's another, and I'm not even sure which directory Solr actually looks 
in.  In my solr.xml file I have it use a shared library folder for every core.  
Since each core will be holding very homologous data, there's no need to have 
any different library modules for each.

The relevant line in my solr.xml file is <solr persistent="true" 
sharedLib="lib">.  That is housed in .../example/solr/.  So, does it look in 
.../example/lib or .../example/solr/lib?

~Brandon Waterloo


Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Brandon Waterloo
Well, there lies the problem--it's not JUST the Tika jar.  If it's not one 
thing, it's another, and I'm not even sure which directory Solr actually looks 
in.  In my Solr.xml file I have it use a shared library folder for every core.  
Since each core will be holding very homologous data, there's no need to have 
any different library modules for each.

The relevant line in my solr.xml file is solr persistent=true 
sharedLib=lib.  That is housed in .../example/solr/.  So, does it look in 
.../example/lib or .../example/solr/lib?

~Brandon Waterloo

From: Markus Jelsma [markus.jel...@openindex.io]
Sent: Thursday, March 24, 2011 11:29 AM
To: solr-user@lucene.apache.org
Cc: Brandon Waterloo
Subject: Re: Multiple Cores with Solr Cell for indexing documents

Sounds like the Tika jar is not on the class path. Add it to a directory where
Solr's looking for libs.

Re: Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Markus Jelsma
I believe it's example/solr/lib where it looks for shared libs in multicore.
But each core can have its own lib dir, usually in core/lib. This is
referenced in solrconfig.xml; see the example config for the lib directive.
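
As a rough sketch of the per-core lib directive: the fragment below is modeled on the <lib> entries in the example solrconfig.xml that ships with Solr. The core0 location and the relative paths are assumptions for illustration. A relative dir value is resolved against the core's instance directory, so a path that worked for a single core rooted at example/solr/ may need one more "../" once the core lives in example/solr/core0/.

```xml
<!-- solrconfig.xml for one core, e.g. .../example/solr/core0/conf/solrconfig.xml
     (the core0 location is an assumption; adjust the paths to your layout) -->
<config>
  <!-- Jars private to this core, kept in core0/lib -->
  <lib dir="./lib" />

  <!-- Solr Cell and its Tika dependencies from the unpacked distribution.
       From example/solr/core0/ it takes three "../" to reach the distribution root. -->
  <lib dir="../../../contrib/extraction/lib" />
  <lib dir="../../../dist/" regex="apache-solr-cell-\d.*\.jar" />

  <!-- ... rest of the core's configuration ... -->
</config>
```

If the Tika classes are still not found, checking that each dir really exists relative to the core's instance directory is a reasonable first step.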

 Well, there lies the problem--it's not JUST the Tika jar.  If it's not one
 thing, it's another, and I'm not even sure which directory Solr actually
 looks in.  In my solr.xml file I have it use a shared library folder for
 every core.  Since each core will be holding very homologous data, there's
 no need to have any different library modules for each.
 
 The relevant line in my solr.xml file is <solr persistent="true"
 sharedLib="lib">.  That is housed in .../example/solr/.  So, does it look
 in .../example/lib or .../example/solr/lib?
 
 ~Brandon Waterloo
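
To make the sharedLib question concrete, here is a minimal sketch of a multicore solr.xml. The core names and instanceDir values are assumptions; only the persistent and sharedLib attributes come from the message above. A relative sharedLib is resolved against the Solr home (the directory that contains solr.xml), so with solr.xml in .../example/solr/ the value lib would point at .../example/solr/lib.

```xml
<!-- .../example/solr/solr.xml -->
<solr persistent="true" sharedLib="lib">
  <!-- sharedLib="lib" is relative to the Solr home, i.e. .../example/solr/lib;
       jars placed there are visible to every core -->
  <cores adminPath="/admin/cores">
    <!-- core names and directories below are hypothetical -->
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
```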
 
 From: Markus Jelsma [markus.jel...@openindex.io]
 Sent: Thursday, March 24, 2011 11:29 AM
 To: solr-user@lucene.apache.org
 Cc: Brandon Waterloo
 Subject: Re: Multiple Cores with Solr Cell for indexing documents
 
 Sounds like the Tika jar is not on the class path. Add it to a directory
 where Solr's looking for libs.
 
Multiple Cores with Solr Cell for indexing documents

2011-03-22 Thread Brandon Waterloo
Hello everyone,

I've been trying for several hours now to set up Solr with multiple cores with 
Solr Cell working on each core.  The only items being indexed are PDF, DOC, and 
TXT files (with the possibility of expanding this list, but for now, just 
assume the only things in the index should be documents).

I never had any problems with Solr Cell when I was using a single core.  In 
fact, I just ran the default installation in example/ and worked from that.  
However, trying to migrate to multi-core has been a never-ending list of 
problems.

Any time I try to add a document to the index (using the same curl command as I 
did to add to the single core, of course adding the core name to the request 
URL-- host/solr/corename/update/extract...), I get HTTP 500 errors due to 
classes not being found and/or lazy loading errors.  I've copied the exact 
example/lib directory into the cores, and that doesn't work either.

Frankly, the only libraries I want are those relevant to indexing files.  The 
less bloat, the better, after all.  However, I cannot figure out where to put 
which files, or why the example installation works perfectly for a single core 
but not with multiple cores.

Here is an example of the errors I'm receiving:

<command prompt> curl 
"host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F 
"myfile=@test2.txt"

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body><h2>HTTP ERROR: 500</h2><pre>org/apache/tika/exception/TikaException

java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 27 more
</pre>
<p>RequestURI=/solr/core0/update/extract</p><p><i><small><a href="http://jetty.mortbay.org/">Powered by Jetty://</a></small></i></p>

</body>
</html>

Any assistance you could provide or installation guides/tutorials/etc. that you 
could link me to would be greatly appreciated.  Thank you all for your time!

~Brandon Waterloo