: I have a project: a search engine. My part is:
: 1. Create the database for the search engine
: 2. Indexing and searching
: My friend told me I need to index all the webpages, then save the
: indexed files in the database.
: I don't know which approach is right.

Databases can build internal "indexes" on tables to make certain queries faster, so if you have a database of webpages you can build an index on something like a "size" field to make searching for pages by size faster. Some databases have a feature called a "fulltext" index that can be built on text columns to make searching for words faster than doing simple "LIKE" queries. This can work in some use cases, but these database "fulltext" indexes tend to be very limited and not easy to customize.

Based on what you've described, a couple of Lucene subprojects might be useful to you...

http://lucene.apache.org/nutch/
Nutch is specifically designed to crawl and index webpages.

http://lucene.apache.org/solr/
Solr is a search "application" that lets you index/query content from any language over HTTP. It comes with a DataImportHandler plugin that lets you automatically index databases, using configuration to describe how to fetch the logical contents of each "document".

http://lucene.apache.org/java/
Lucene-Java is the underlying search library used in both Nutch and Solr; if you want to build custom search logic you can use this library directly. As you mentioned, there is also a Hibernate project for integrating with Lucene.

If you have follow-up questions about any of those three subprojects, please consult the specific user mailing list for the project you are interested in.

-Hoss
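To make "indexing" concrete: what Lucene (and a database fulltext index) builds under the hood is an inverted index, a map from each term to the documents containing it, so a word search never has to scan every page. This is a minimal, self-contained Java sketch of that idea (toy code for illustration, not the Lucene API; the class and method names are my own):

```java
import java.util.*;

// Toy inverted index: maps each term to the set of document ids
// containing it, so word lookups avoid scanning every document.
public class InvertedIndex {
    private final Map<String, Set<Integer>> index = new HashMap<>();
    private final List<String> docs = new ArrayList<>();

    // Tokenize a page's text and record a term -> docId posting
    // for every word in it. Returns the new document's id.
    public int add(String text) {
        int docId = docs.size();
        docs.add(text);
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Return ids of documents containing every word of the query
    // (an AND query): intersect the posting sets term by term.
    public Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : query.toLowerCase().split("\\W+")) {
            if (term.isEmpty()) continue;
            Set<Integer> postings =
                index.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(postings);
            } else {
                result.retainAll(postings);
            }
        }
        return result == null ? Collections.emptySet() : result;
    }

    public static void main(String[] args) {
        InvertedIndex idx = new InvertedIndex();
        idx.add("Nutch crawls and indexes webpages");   // doc 0
        idx.add("Solr indexes content over HTTP");      // doc 1
        System.out.println(idx.search("indexes"));      // [0, 1]
        System.out.println(idx.search("indexes HTTP")); // [1]
    }
}
```

Your friend's "index the webpages, then save the indexed files" plan is essentially this: the crawler feeds pages into `add`, and the resulting index (not just the raw pages) is what gets persisted and queried. Lucene does the same thing with far better tokenization, ranking, and on-disk storage, which is why rolling your own is usually only worth it as a learning exercise.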
