Re: Forrest as an XML repository

Ross Gardler Fri, 03 Jun 2005 03:41:05 -0700

Juan Jose Pablos wrote:

FYI:


Ricardo Beltran wrote:


I've CC'd Ricardo on this reply - please reply all.

...

My questions are: Do you think that Forrest is an appropriate framework
for this purpose? And do you think that Lucene or
Google will do the job of indexing about 5 GB of XML
files?

I can't comment with authority on the suitability of Google or Lucene for this as I have no experience. My gut is telling me that this is not the optimal solution.

I do have a project that has around 8 GB of dynamic data being published via the Forrest webapp.

The solution I employed, and one that appears to be working well, was to have the data in an XML-enabled database. In this case we used Oracle, but we have successfully used Xindice and eXist in similar, smaller projects in the past. I wrote a custom generator to retrieve the data from the DBMS.

It should be noted that Cocoon has some database components that can be utilised (the results of some early experiments I did with these components are in the whiteboard plugin org.apache.forrest.plugin.Database). The reason I never completed work on this plugin was not a problem with it, but additional requirements that made it easier to build a custom generator (our requests were also dependent on live data from sensor readings over an RS232 port).

The system has now been running for about 3 months and we are very happy with it. Because we are using a database server as the repository we have all the indexing and optimisation provided by that server. We also have the benefit of a very expressive and mature search language.
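
To give a rough idea of what I mean by an expressive search language, here is a minimal sketch of a structured, XPath-style query. It is not our production code: the element and attribute names (readings, reading, sensor, value) and the function name are made up for illustration, and it runs against a small in-memory document using Python's standard library rather than a real XML database.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element and attribute names are invented.
SAMPLE = """
<readings>
  <reading sensor="a1" unit="C"><value>21.5</value></reading>
  <reading sensor="b2" unit="C"><value>19.0</value></reading>
  <reading sensor="a1" unit="C"><value>22.1</value></reading>
</readings>
"""


def values_for_sensor(xml_text, sensor_id):
    """Return the numeric values of all readings from one sensor."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a limited XPath subset, including
    # attribute predicates such as [@sensor='a1'].
    matches = root.findall(f"./reading[@sensor='{sensor_id}']/value")
    return [float(v.text) for v in matches]


if __name__ == "__main__":
    print(values_for_sensor(SAMPLE, "a1"))
```

A full-text index such as Lucene or Google can find documents that mention "a1", but a query like this selects on the structure of the XML itself. An XML database lets you run the same kind of query with the server's indexes behind it.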

Of course, this solution requires that you run the system dynamically. Using Google to index your site would allow you to run statically. Trying to build a static site from 5 GB of data would be a wonderful stress test; if you do this, please report your findings to us.


Ross
