[CODE4LIB] Solr for Internal Searching

2008-08-05 Thread Cloutman, David
Today my boss asked me to come up with a solution that would let us
index and search our intranet. I was already thinking of using Solr on
the public Web site we are building, and thought this might be a good
opportunity to knock two items off the to-do list with the same
technology. I know there was a preconference session on Solr this year,
and I have the sense that Solr is gaining traction in the library
community. Is there any reason why I shouldn't do this?

Thanks,

- David



---
David Cloutman [EMAIL PROTECTED]
Electronic Services Librarian
Marin County Free Library 

Email Disclaimer: http://www.co.marin.ca.us/nav/misc/EmailDisclaimer.cfm


Re: [CODE4LIB] Solr for Internal Searching

2008-08-05 Thread Bess Sadler

Hi, David.

I think solr is great, and I use it all the time and can highly  
recommend it. However, if what you have is mostly HTML pages, you  
might want to consider nutch (http://lucene.apache.org/nutch) instead.


Both solr and nutch are based on lucene, but nutch will give you more  
built-in tools for crawling your website. Use the right tool for the  
job and all that. :)


Bess


Re: [CODE4LIB] Solr for Internal Searching

2008-08-05 Thread Michael J. Giarlo
The nice thing about nutch is that it exposes an OpenSearch interface.
So you can write your search-y webapps in any language that can speak
HTTP and XML, which both Java and PHP should be able to handle.  In
fact, I'd be surprised if both languages didn't already have
OpenSearch libraries.
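
To make "speak HTTP and XML" concrete, here is a minimal sketch of a client for an OpenSearch-style RSS response. It is in Python only for brevity; the same shape should carry over to Java or PHP. Everything specific in it is an assumption for illustration, not something confirmed in this thread: the endpoint URL, the `query` parameter name, the OpenSearch 1.1 namespace, and the sample response. Check what your own nutch install actually returns before relying on any of it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed namespace (OpenSearch 1.1 response elements). A given server,
# including older nutch releases, may declare a different namespace URI.
OPENSEARCH_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}


def build_search_url(base_url, terms):
    """Build a GET URL for a search. 'query' is an assumed parameter name."""
    return base_url + "?" + urlencode({"query": terms})


def parse_results(rss_text):
    """Return (total hit count, [(title, link), ...]) from an RSS response."""
    channel = ET.fromstring(rss_text).find("channel")
    total = int(channel.findtext("os:totalResults", default="0",
                                 namespaces=OPENSEARCH_NS))
    hits = [(item.findtext("title"), item.findtext("link"))
            for item in channel.findall("item")]
    return total, hits


# Hypothetical response body, hand-written for this example.
SAMPLE_RESPONSE = """
<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Intranet search: library hours</title>
    <opensearch:totalResults>2</opensearch:totalResults>
    <item>
      <title>Library Hours</title>
      <link>http://intranet.example.org/hours.html</link>
    </item>
    <item>
      <title>Holiday Closures</title>
      <link>http://intranet.example.org/closures.html</link>
    </item>
  </channel>
</rss>
""".strip()

if __name__ == "__main__":
    print(build_search_url("http://localhost:8080/opensearch", "library hours"))
    total, hits = parse_results(SAMPLE_RESPONSE)
    print(total, "results")
    for title, link in hits:
        print(title, link)
```

Against a live server you would fetch the built URL (for example with `urllib.request.urlopen`) and pass the response body to `parse_results`. The point is only that the client side is a GET request plus an XML parse, with no search-engine-specific library required.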

nutch++

-Mike


On Tue, Aug 5, 2008 at 7:29 PM, Cloutman, David
[EMAIL PROTECTED] wrote:
 Thanks to both Roy and Bess, and anyone else who posts after I write
 this. I'll definitely have to look into nutch. Just to state my needs a
 little more clearly, I'm trying to keep our applications contained to
 Java and PHP solutions, if possible, as our machines are already
 configured to utilize those platforms.




Re: [CODE4LIB] Solr for Internal Searching

2008-08-05 Thread Nate Vack
I know this is code4lib, not buystuff4lib, but the Google Mini is
reputed to be rather quick, bulletproof and configurable, and starts
at $3k. For example, it works nicely with lots of file formats
(including Office documents) out of the box, and it works with LDAP
and NTLM for authentication and authorization.

I suspect it'll probably be challenging to deliver a quality search
solution for a lower total cost.

Of course, this all depends on what your intranet looks like on the
inside. I've seen 'intranet' mean things that would call for wildly
different search solutions.

So... solr is great, but this question doesn't contain nearly enough
information to answer whether it's a good fit for your task at hand.

Cheers
-Nate



Re: [CODE4LIB] Solr for Internal Searching

2008-08-05 Thread Tim Spalding
Does Google Mini facet? It seems to have a concept of collections, but
does it facet by them?

T

On Wed, Aug 6, 2008 at 12:05 AM, Bill Dueber [EMAIL PROTECTED] wrote:
 At UMich, we use space on a Google Appliance as our site search
 (different setups for internal vs. public pages) and have been pretty
 happy. I've been able to abuse the google ads space to our benefit
 -- e.g., go to http://lib.umich.edu/ and search Web Pages for 'grad'
 (get today's hours) or 'dueber' (find me).


 --
 Bill Dueber
 Library Systems Programmer
 University of Michigan Library




-- 
Check out my library at http://www.librarything.com/profile/timspalding