Thanks for your help! I will seriously consider your opinions. For now we are not sure we will start this project, because we do not have the resources yet, but if I use Forrest to build it I will stay in contact with you through the mailing list.

Best regards
Ricardo Beltran
[EMAIL PROTECTED]

On 03/06/05, Ross Gardler <[EMAIL PROTECTED]> wrote:
> Juan Jose Pablos wrote:
> > FYI:
> >
> > Ricardo Beltran wrote:
>
> I've CC'd Ricardo on this reply - please reply all.
>
> ...
>
> >> My questions are: Do you think that Forrest is an appropriate framework
> >> for this purpose? and Do you think that Lucene or
> >> Google will do the job of indexing about (5 GB) of XML
> >> files?
>
> I can't comment with authority on the suitability of Google or Lucene
> for this as I have no experience. My gut is telling me that this is not
> the optimal solution.
>
> I do have a project that has around 8GB of dynamic data being published
> via the Forrest webapp.
>
> The solution I employed, and one that appears to be working well, was to
> have the data in an XML-enabled database. In this case we used Oracle,
> but we have successfully used Xindice and eXist in similar, smaller
> projects in the past. I wrote a custom generator to retrieve the data
> from the DBMS.
>
> It should be noted that Cocoon has some database components that can be
> utilised (the results of some early experiments I did with these
> components are in the whiteboard plugin
> org.apache.forrest.plugin.Database). The reason I never completed work
> on this plugin was not a problem with it, but additional requirements
> that made it easier to build a custom generator (our requests were also
> dependent on live data from sensor readings over an RS232 port).
>
> The system has now been running for about 3 months and we are very happy
> with it. Because we are using a database server as the repository we
> have all the indexing and optimisation provided by that server. We also
> have the benefit of a very expressive and mature search language.
>
> Of course, this solution requires that you run the system dynamically.
> Using Google to index your site would allow you to run statically.
> Trying to build a static site from 5 GB of data would be a wonderful
> stress test; if you do this, please report your findings to us.
>
> Ross
>
