Hi,

As I'm not totally happy with the Crawler and Indexer component interfaces, I want to address some issues here:
Today CocoonCrawler exposes void crawl(URL) and Iterator iterator(): crawl() sets the base URL, and iterator() delivers, one at a time, the URLs reachable from the base URL.

I have some headaches using URL objects in the command-line environment. The only simple possibility there is to use file: URLs, which implies storing the crawled XML document to the filesystem, and that is something I want to avoid for the sake of performance. Thus I was thinking of changing the interface to void crawl(Source) and Iterator iterator(), i.e. working with Source objects instead of URL objects. LuceneCocoonIndexer should also change from using URL to using Source.

The main reason for this change: as implemented today, crawling and indexing work only over the http: protocol. If you want to index XML documents of the local Cocoon, or if you want to create an index in the command-line version of Cocoon, you may not be able to use the http protocol. Hence the idea of using Source.

Perhaps someone with a broader and more detailed understanding of the Cocoon internals could help me a bit.

bye
bernhard
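To make the proposal concrete, here is a minimal sketch of what the changed interfaces could look like. Note the Source interface below is only a stand-in for illustration (the real abstraction would come from Cocoon/Excalibur), and SimpleCrawler is a hypothetical toy implementation, not actual Cocoon code:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Minimal stand-in for Cocoon's Source abstraction (illustration only). */
interface Source {
    String getSystemId();
}

/** Proposed crawler interface: Source instead of URL. */
interface CocoonCrawler {
    void crawl(Source base);
    Iterator<Source> iterator();
}

/** Proposed indexer change, sketched the same way. */
interface LuceneCocoonIndexer {
    void index(Source document);
}

/** Toy crawler that pretends two documents are reachable from the base. */
class SimpleCrawler implements CocoonCrawler {
    private final List<Source> found = new ArrayList<>();

    public void crawl(Source base) {
        found.clear();
        // A real crawler would resolve the base Source, parse the XML,
        // and follow links; here we fake two reachable documents.
        found.add(() -> base.getSystemId() + "/a.xml");
        found.add(() -> base.getSystemId() + "/b.xml");
    }

    public Iterator<Source> iterator() {
        return found.iterator();
    }
}
```

The point of the sketch is that nothing here depends on a URL scheme: a Source could just as well be backed by the local Cocoon pipeline (e.g. a cocoon: pseudo-protocol) as by an http: connection, so the same crawler works in the command-line environment.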