Filed as http://issues.apache.org/jira/browse/NUTCH-142
I didn't think there was much point creating a patch for a 1 line fix :)

m

On 12/16/05, Mike Cannon-Brookes <[EMAIL PROTECTED]> wrote:
> Wow - great responses all.
>
> 0.7 vs 0.8 - apologies if I'm using an old version. I'm using the
> latest binary release. I'll switch to latest SVN HEAD and see how that
> works in my application.
>
> Is there any concrete timeline on 0.8?
>
> I'm very glad to see the statics generally being reduced. I also
> personally (!!) would remove the Nutch configuration system completely
> in favour of Spring - I believe you'd get a lot more power for very
> little investment of time - but I realise that's a much more drastic
> step for a code-newbie to suggest :)
>
> The directory listing in a J2EE application is a problem. Why do you
> need to get a directory listing? The way we load plugins in J2EE is to
> say "find me all resources named /plugin.xml", then load each of those
> XML files, then from there load the relevant classes etc as indicated.
>
> Our plugin system has a series of different 'plugin loaders' that
> handle the different strategies, which I think might work well here.
> So far we have a loader for specific files, a loader for all plugins
> in directory X, a loader which scans the file system regularly, a
> loader which uses the classpath as above etc. This puts the
> flexibility in the hands of the developer as to the restrictions their
> application will have.
>
> I'll get to work on that patch - back in a little.
>
> m
>
> PS Is anyone actively working on the wiki? It seems a little out of
> date and there's a lot more information in the mailing list. It would
> be awesome if someone would regularly trawl the mailing list for
> 'tidbits' (call them "Doug's Droppings of Wisdom"?) and wiki-ise them
> for new users. Mailing list archives are always a crap way to find
> information.
>
> On 12/16/05, Doug Cutting <[EMAIL PROTECTED]> wrote:
> > Mike Cannon-Brookes wrote:
> > > Hey guys,
> >
> > Hi, Mike! Welcome.
> >
> > > - Classloading - I have had many problems with NutchConf due to the
> > > way it loads its resources. In a J2EE scenario, it's simply evil :)
> > > Would there be any great problem with switching its classloader to
> > > Thread.currentThread().getContextClassLoader() instead of the current
> > > static classloader? It's a lot 'friendlier' to do it this way. I can
> > > submit a patch to do this very quickly if others are keen (or anyone
> > > can do it - I've done it locally, takes about 30 keystrokes!)
> >
> > That's not a problem. Please submit a patch. Attach it to a bug report
> > (if you know how to use Jira!).
> >
> > > - Statics - On that issue, there are an awful lot of static classes
> > > and methods around. This makes configuring and using Nutch in 'non
> > > standard' ways difficult as things are hard coded together (for
> > > example I can't easily swap out NutchConf to do my own configuration
> > > mechanism as it's all static accesses!). Is there any interest in
> > > removing / refactoring these statics out to make Nutch more flexible?
> >
> > Yes, that's a goal. I'd like to seriously attack it after we merge the
> > mapred branch to trunk, probably next month.
> >
> > I made a proposal in this vein almost a year ago:
> >
> > http://www.mail-archive.com/[email protected]/msg00196.html
> >
> > Note also that mapred's JobConf is always used dynamically, so all of
> > the new mapred-based code can be dynamically configured. The biggest
> > thing left to fix is plugins. I think perhaps each plugin factory
> > method should take a configuration.
> >
> > > - Plugins / physical files - Quite a lot of stuff in Nutch seems to
> > > rely on physical files (for example plugins are loaded by looking for
> > > the "/plugins" directory on disk IIRC). In a J2EE environment, this
> > > means you can't deploy the WAR as a non-expanded WAR for example. Can
> > > we switch from loading files directly to loading resources as streams?
> > > This means you can load a file from the classloader regardless of
> > > whether or not it exists as a physical file.
> >
> > The problem is that we sometimes need to list directories, e.g., to find
> > out what resources are available. Is there a J2EE-safe way to do that?
> >
> > Cheers,
> >
> > Doug
> >
>
> --
> ATLASSIAN - http://www.atlassian.com
>

--
ATLASSIAN - http://www.atlassian.com
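
For anyone following the thread, here is a minimal sketch of the two ideas discussed above: resolving resources through the thread context class loader instead of a static one, and finding plugin descriptors by asking the class loader for every resource with a given name rather than listing a directory. This is not the NUTCH-142 change itself and not Nutch code; the class name ContextResourceSketch and the methods loader, open and findAll are made up for illustration. Note that ClassLoader lookups take names without a leading slash, so the sketch asks for "plugin.xml" rather than "/plugin.xml".

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Nutch).
class ContextResourceSketch {

    // Prefer the thread context class loader, which a J2EE container
    // typically sets per web application; fall back to the loader that
    // loaded this class if no context loader is set.
    static ClassLoader loader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return (cl != null) ? cl : ContextResourceSketch.class.getClassLoader();
    }

    // Open one resource as a stream. This works whether the resource is a
    // plain file on disk or an entry inside an unexpanded WAR/JAR.
    // Returns null if the resource cannot be found.
    static InputStream open(String name) {
        return loader().getResourceAsStream(name);
    }

    // Find every resource with the given name that is visible to the
    // loader, e.g. one plugin.xml per plugin JAR. No directory listing
    // is needed.
    static List<URL> findAll(String name) {
        try {
            return Collections.list(loader().getResources(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "plugin.xml" is the descriptor name used in the thread; what is
        // found depends entirely on the classpath this runs with.
        for (URL url : findAll("plugin.xml")) {
            System.out.println("found descriptor: " + url);
        }
    }
}
```

Each URL returned by findAll could then be parsed as a descriptor and the classes it names loaded through the same loader, which is roughly the "classpath" plugin loader strategy described above.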
