On Jun 29, 2005, at 4:37 PM, Chris Lu wrote:

How is your crawler done?
I saw SF.net searches several types of documents, like "People",
"Freshmeat.net", "Site Doc". Are they all from the database?

We don't crawl per se; we use triggers in the database to spool changes to a table, which is then processed in batches. The searches that cover data stored on SourceForge.net all work that way. The Freshmeat.net search, on the other hand, actually posts the query to Freshmeat.net. We have no control over their search system; that search type is provided as a convenience for users.


A little bit of marketing here:
I am working on an off-the-shelf product called DBSight. It's
basically Database+Lucene+Query Display. It can do most of the things
you mentioned (no offense to your great work). And it scales to
enterprise-level databases. Basically it can be attached to any
database to create a search engine rapidly.

It can even support a subscriber(s)+publisher mode. So instead of one
powerful machine, you can use several ordinary computers to create a
search farm.

Configurations, including Analyzers, are configurable through the web UI.

check out this demo:
http://search.dbsight.com

I took a quick look at your tool and it seems pretty robust. Perhaps I'll have time to fully evaluate it at some point.

Thanks,
--Chris Conrad
SourceForge.net Engineer


Chris Lu

On 6/29/05, David Spencer <[EMAIL PROTECTED]> wrote:

Chris Conrad wrote:

I know I've been asked before for a description of how SourceForge.net
is using Lucene.  I wrote a blog entry about it and  thought people
might be interested in seeing at a high level how it  was designed.
Take a look at http://blog.dev.sf.net.  Any comments  are welcome.


Thanks for the writeup and nice to see that you guys are using Lucene.

It would be interesting to know what Analyzer you're using. Your blog
entry says you're having some weird problem, and I suspect the most
common source of strange Lucene behavior is that the "wrong"
Analyzer is used.
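
To illustrate the point about the "wrong" Analyzer, here is a toy sketch in plain Python. It does not use Lucene's API; lowercase_analyzer, keyword_analyzer, build_index and search are hypothetical stand-ins that only mimic the idea of tokenizing text at index time and at query time. When the two sides tokenize differently, a query that looks correct can silently match nothing.

```python
def lowercase_analyzer(text):
    """Split on whitespace and lowercase each token (hypothetical stand-in)."""
    return [token.lower() for token in text.split()]


def keyword_analyzer(text):
    """Treat the whole input as one unmodified token (hypothetical stand-in)."""
    return [text]


def build_index(docs, analyzer):
    """Build a term -> set of document ids mapping using the given analyzer."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyzer(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, analyzer):
    """Return ids of documents containing every term the analyzer produces."""
    terms = analyzer(query)
    if not terms:
        return set()
    result = None
    for term in terms:
        ids = index.get(term, set())
        result = set(ids) if result is None else result & ids
    return result


docs = {1: "Apache Lucene", 2: "Lucene Search"}
index = build_index(docs, lowercase_analyzer)

# Same analyzer on both sides: the query term becomes "lucene" and matches.
matched = search(index, "Lucene", lowercase_analyzer)

# Mismatched analyzer: the query term stays "Lucene", which was never indexed.
missed = search(index, "Lucene", keyword_analyzer)
```

Here matched contains both documents while missed is empty, even though the query string is identical in both calls.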

Also, out of curiosity: what's the peak load the search system gets?

thx,
  Dave



--Chris Conrad
SourceForge.net Engineer


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




