Hi James,

On Mon, Jun 13, 2011 at 11:32 PM, James Kosin <[email protected]> wrote:
> On 6/13/2011 10:23 PM, [email protected] wrote:
> > Hi,
> >
> > Currently we only have implemented custom feature generators that we can
> > pass from command line only for NameFinder, but it would be very nice to
> > have it for all tools.
> > The Thai sentence detector customization is nice and simple, but to do
> > something for other languages the user would need to branch the code. We
> > should allow users to pass a factory class name from command line. Maybe we
> > could do it for every tool that doesn't use sequence feature generator. Also
> > would be nice to save the factory class name to the model to make sure we
> > are using the same feature generator during runtime and evaluation.
> >
> > What do you think? Maybe you have thought a better solution for that.
> >
> > Thanks
> > William
> >
> William,
>
> We discussed various options, unfortunately, most involved some security
> risk for the Java engine; including allowing the saving of the actual
> feature generator constructor itself to the model. Maybe the XML option
> may be a better route for the long run. We could even save the copy of
> the XML document in the model itself. But again that opens us up for
> issues if someone writes bad XML to cause issues.
>

Yes, it is very nice with the NameFinder because we can reuse code using
the XML descriptors.

> Maybe, we could have the feature generator a generic class that needed a
> constructor. Then each implementing language could have a new
> constructor that correctly built the feature generator. Unfortunately,
> it means a change would break any models.
>

I can't see why it would break the models. We could by default use the
current feature generators.
If we use a factory to create the feature generator, the user is free to
create it using any resource (another dictionary implementation, for
example).

> We may need to re-open the issue when Jorn comes back or at least get
> another discussion going so we can try and weed out the issues with the
> options available.
>

Thanks
William
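
P.S. To make the factory idea a bit more concrete, here is a rough sketch of what I have in mind. It is only an illustration under my own assumptions, not existing OpenNLP code: FeatureGenerator, FeatureGeneratorFactory, DefaultFactory, loadFactory and the "featureGeneratorFactory" manifest key are all hypothetical names. The point is that an empty class name falls back to the current behaviour, so existing models would keep working.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch; none of these types are part of the OpenNLP API.
class FactorySketch {

    /** Stand-in for a tool's feature generator. */
    interface FeatureGenerator {
        List<String> features(String[] tokens, int index);
    }

    /** What a user would implement and name on the command line. */
    interface FeatureGeneratorFactory {
        FeatureGenerator create();
    }

    /** Default factory: plays the role of the current built-in generators. */
    public static class DefaultFactory implements FeatureGeneratorFactory {
        @Override
        public FeatureGenerator create() {
            return (tokens, index) -> {
                List<String> feats = new ArrayList<>();
                feats.add("w=" + tokens[index]);
                return feats;
            };
        }
    }

    /** Loads the named factory, or the default when no name is given. */
    static FeatureGeneratorFactory loadFactory(String className) {
        if (className == null || className.isEmpty()) {
            return new DefaultFactory();
        }
        try {
            Class<?> clazz = Class.forName(className);
            // Refuse classes that are not factories before instantiating them.
            if (!FeatureGeneratorFactory.class.isAssignableFrom(clazz)) {
                throw new IllegalArgumentException(
                        className + " does not implement FeatureGeneratorFactory");
            }
            return (FeatureGeneratorFactory) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create factory " + className, e);
        }
    }

    /** Records the factory class name so runtime and evaluation can reuse it. */
    static void recordFactory(Properties manifest, FeatureGeneratorFactory factory) {
        manifest.setProperty("featureGeneratorFactory", factory.getClass().getName());
    }
}
```

Only the class name is stored, not a serialized object, and the type check happens before instantiation. That does not remove the security concern James raised about loading user-supplied classes, but it keeps the model format itself simple.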
