On Jun 29, 2005, at 4:37 PM, Chris Lu wrote:
How is your crawler done?
I saw that SF.net searches several types of documents, like "People",
"Freshmeat.net", and "Site Doc". Are they all from a database?
We don't crawl per se. We use triggers in the database to spool
changes to a table, which is then processed in batches. The searches
over data stored on SourceForge.net all work that
way. The Freshmeat.net search, on the other hand, actually posts the
search to Freshmeat.net. We have no control over their search
system; that search type is provided as a convenience for users.
A little bit of marketing here:
I am working on an off-the-shelf product called DBSight. It's
basically Database+Lucene+Query Display. It can do most of the things
you mentioned (no offense to your great work), and it scales to
enterprise-level databases. Basically it can be attached to any
database to create a search engine rapidly.
It can even support subscriber(s)+publisher mode. So instead of a
powerful machine, you can use several ordinary computers to create a
search farm.
Settings, including Analyzers, are configurable through the web UI.
Check out this demo:
http://search.dbsight.com
I took a quick look at your tool and it seems pretty robust. Perhaps
I'll have time to fully evaluate it at some point.
Thanks,
--Chris Conrad
SourceForge.net Engineer
Chris Lu
On 6/29/05, David Spencer <[EMAIL PROTECTED]> wrote:
Chris Conrad wrote:
I know I've been asked before for a description of how SourceForge.net
is using Lucene. I wrote a blog entry about it and thought people
might be interested in seeing at a high level how it was designed.
Take a look at http://blog.dev.sf.net. Any comments are welcome.
Thanks for the writeup and nice to see that you guys are using
Lucene.
It would be interesting to know what Analyzer you're using. Your
blog entry says you're having some weird problem, and I suspect the
most common source of strange Lucene behavior is that the "wrong"
Analyzer is used.
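
One common form of the "wrong" Analyzer problem is indexing with one analyzer and querying with another, so the query terms never match the indexed terms. The sketch below is a toy illustration only. The two analyzer functions and the dict-based index are hypothetical stand-ins, not Lucene's API.

```python
def whitespace_analyzer(text):
    # Splits on whitespace and keeps the original case.
    return text.split()


def lowercase_analyzer(text):
    # Splits on whitespace and lowercases every term.
    return [term.lower() for term in text.split()]


def build_index(docs, analyzer):
    """Build a toy inverted index: term -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyzer(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, analyzer):
    """Return ids of documents containing every analyzed query term."""
    terms = analyzer(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {1: "SourceForge Search"}
index = build_index(docs, lowercase_analyzer)

# Same analyzer at index and query time: the document is found.
matched = search(index, "SourceForge", lowercase_analyzer)

# Different analyzer at query time: "SourceForge" is never lowercased,
# so it does not match the indexed term "sourceforge".
mismatched = search(index, "SourceForge", whitespace_analyzer)
```

Here matched contains document 1 while mismatched is empty, even though the query text is identical.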
Also, out of curiosity: what's the peak load the search system gets?
thx,
Dave
--Chris Conrad
SourceForge.net Engineer
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]