Re: Data Import Handler Rich Format Documents

2010-06-18 Thread Sixten Otto
On Fri, Jun 18, 2010 at 2:42 PM, Chris Hostetter hossman_luc...@fucit.org wrote: I'm confused ... You're using DIH, and some of your fields are URLs to documents that you want to parse with Tika? Why would you need a custom Transformer? Yeah, I can definitely vouch that DIH can handle this

Re: HOWTO get a working copy of SOLR?

2010-06-15 Thread Sixten Otto
On Tue, Jun 15, 2010 at 12:58 AM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote: - changed to SOLR branch_3x. Installs fine, runs fine, luke works fine but the extraction with /update/extract (ExtractingRequestHandler) only replies the metadata but not the content. Sounds like

Re: Tomcat startup script

2010-06-09 Thread Sixten Otto
On Tue, Jun 8, 2010 at 4:18 PM, cbenn...@job.com wrote: The following should work on centos/redhat, don't forget to edit the paths, user, and java options for your environment. You can use chkconfig to add it to your startup. Thanks, Colin. Sixten

Re: Tomcat startup script

2010-06-08 Thread Sixten Otto
On Mon, Jun 7, 2010 at 9:23 PM, K Wong wongo...@gmail.com wrote: Did you install tomcat 5.5 from an RPM? I did not, on the advice of that same Solr wiki article that manual installation is recommended because distribution Tomcats are either old or quirky. There haven't been any issues with this,

Re: Tomcat startup script

2010-06-08 Thread Sixten Otto
On Tue, Jun 8, 2010 at 11:00 AM, K Wong wongo...@gmail.com wrote: Okay. I've been running multicore Solr 1.4 on Tomcat 5.5/OpenJDK 6 straight out of the centos repo and I've not had any issues. We're not doing anything wild and crazy with it though. It's nice to know that the wiki's advice

Re: TikaEntityProcessor on Solr 1.4?

2010-06-08 Thread Sixten Otto
2010/5/22 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@gmail.com: just copy the dih-extras jar file from the nightly should be fine Now that I've finally got a server on which to attempt to set these things up... this turns out not to be a viable solution. The extras jar does contain the

Tomcat startup script

2010-06-07 Thread Sixten Otto
So, looking at the wiki article on setting up Solr with Tomcat (http://wiki.apache.org/solr/SolrTomcat), there's a link to an attached init.d script for CentOS/RedHat/Fedora. Trouble is, the wiki won't let me access it. Even after creating an account and logging in, clicking on the link

Re: Tomcat startup script

2010-06-07 Thread Sixten Otto
On Mon, Jun 7, 2010 at 2:35 PM, Chris Hostetter hossman_luc...@fucit.org wrote: there is currently a bug with the apache wiki and attachments... https://issues.apache.org/jira/browse/INFRA-2773 Glad to know it's not just me. But does anyone have that script posted anywhere else? Sixten

Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Sixten Otto
2010/5/19 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@gmail.com: I guess it should work because Tika Entityprocessor does not use any new 1.4 APIs On Wed, May 19, 2010 at 1:17 AM, Sixten Otto six...@sfko.com wrote: The TikaEntityProcessor class that enables DataImportHandler to process business

Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Sixten Otto
On Fri, May 21, 2010 at 5:30 PM, Chris Harris rygu...@gmail.com wrote: Actually, rather than cherry-pick just the changes from SOLR-1358 and SOLR-1583 what I did was to merge in all DataImportHandler-related changes from between the 1.4 release up through Solr trunk r890679 (inclusive). I'm

Re: Which Solr to use?

2010-05-18 Thread Sixten Otto
On Tue, May 18, 2010 at 10:40 AM, Robert Muir rcm...@gmail.com wrote: Some discussions/voting happened and the trunk is intended to be ... more like a normal trunk. If you need features not in an official release, and are looking for a codebase with updated features, I would recommend instead

TikaEntityProcessor on Solr 1.4?

2010-05-18 Thread Sixten Otto
Sorry to repeat this question, but I realized that it probably belonged in its own thread: The TikaEntityProcessor class that enables DataImportHandler to process business documents was added after the release of Solr 1.4, along with some other changes (like the binary DataSources) to support it.

Which Solr to use?

2010-05-17 Thread Sixten Otto
I've been investigating Solr on and off as a (or even the) search solution for my employer's content management solution. One of the biggest questions in my mind at this point is which version to go with. In general, 1.4 would seem the obvious choice, as it's the only released version on that