> From: Yaser Al Masri [mailto:[EMAIL PROTECTED]]
>
> Hi,
>
> I've been looking at the implementation of Lucene inside Cocoon, and it's
> great what this combo is able to deliver. Things will continue to work
> perfectly until a database shows up. The 'like' and 'contains' functions in SQL
> and XPath queries respectively cannot make much use of the indexing supplied
> by the corresponding database because they are sub-string searches, while
> search engines will help here because of the implementation of full-text
> string searches, so I guess the justification for using search engines
> rather than query statements for this case is valid.
>
> The problem here AFAIK is that to use Lucene you have to supply it with a
> Directory object that points to the location of your data files, which
> doesn't exist here in the case of a database.

What do you mean here? The index is built independently of wherever your
data is stored, so I do not understand your conclusion.

> I thought I could mimic the
> implementation of Lucene inside Cocoon but for the Xindice database instead
> of the spreading files,

It already works on content stored anywhere; the search/indexing mechanism
does not care about data storage.

> so I started studying the crawler behavior and if it
> could be attained with the help of the pseudo protocol implementation for
> Xindice, but I got stuck there with the fact that I'm only mapping URLs to
> the underlying database, not to actual directories and files seen by the indexer.

The indexer does *not* work on files; it works on whatever content it is given.

> One more problem that appeared is the information retrieved from the
> searcher. For many cases that native XML databases help in, it would be of
> no use to the user to get a URL for his search rather than a brief
> description of the document he searched for (like in searching for items in
> a catalog where the description and the image matter more than the URL that
> links to it).

I believe it is possible to have a document abstract with Lucene. You need to
investigate Lucene's capabilities and augment the indexer with this feature.

> So I thought if Lucene could be adapted to retrieve documents
> inside collections rather than URL.

It is not Lucene's concern to retrieve documents; use components that
specialize in this, like CIncludeTransformer.

> Can anybody give me an insight on this issue? Are there any future plans to
> include search support for XML and other databases?

Again: the search engine does not care where your resources are stored.
Given this, the search mechanism already has support for "XML and other
databases".

> How do you think the
> support of XQuery language will help in the search process?

It's no help here. It's a different domain.
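
To make the "indexer does not care about storage" point concrete, here is a
minimal sketch in plain Java. It is deliberately *not* the Lucene API: the
classes `Resource`, `TinyIndex` and `StorageAgnosticIndexDemo`, the
`db:catalog/...` ids and the sample data are all hypothetical and exist only
for illustration. The sketch assumes each resource carries an id, a short
abstract and its text. It shows an index that only ever sees those three
things, never where they came from, and that hands the stored abstract back
with each hit.

```java
import java.util.*;

// Hypothetical unit of indexing: an id (a URL, a database key, anything),
// a short abstract to show in search results, and the text to be searched.
final class Resource {
    final String id;
    final String summary;
    final String text;

    Resource(String id, String summary, String text) {
        this.id = id;
        this.summary = summary;
        this.text = text;
    }
}

// Toy inverted index (an illustration only, not Lucene): term -> resource ids.
final class TinyIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();
    private final Map<String, String> storedSummaries = new HashMap<>();

    // Accepts resources from any source; the index never learns where they live.
    void add(Resource r) {
        storedSummaries.put(r.id, r.summary);
        for (String term : r.text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(r.id);
            }
        }
    }

    // Returns "id: summary" for every resource whose text contains the term.
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        Set<String> ids = postings.getOrDefault(term.toLowerCase(),
                                                Collections.<String>emptySet());
        for (String id : ids) {
            hits.add(id + ": " + storedSummaries.get(id));
        }
        return hits;
    }
}

final class StorageAgnosticIndexDemo {
    // Stand-in for documents fetched from a database; ids and data are made up.
    static List<Resource> fromDatabase() {
        return Arrays.asList(
            new Resource("db:catalog/item1", "Red bicycle, 21 gears",
                         "A red mountain bicycle with 21 gears"),
            new Resource("db:catalog/item2", "Blue kettle",
                         "An electric kettle in blue"));
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        for (Resource r : fromDatabase()) {
            index.add(r);
        }
        // Each hit carries the stored abstract, not just the id.
        System.out.println(index.search("bicycle"));
    }
}
```

Swapping `fromDatabase()` for something that reads files or crawls URLs would
not change `TinyIndex` at all, which is the whole point. In Lucene itself the
equivalent of `summary` would be a stored field on the indexed document, but
check the Lucene documentation for the exact calls in your version.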

Regards,
Vadim