> On 19.10.2016, at 20:10, Marshall Schor <m...@schor.com> wrote:
>
> 1) Specifying "object A" to a component. My thinking did not go beyond what is
> done today for external shared resources. UIMA provides an
> ExternalResourceDescription as part of a component; this is eventually fed to
> UIMA's "produceResource" methods to produce an instance of the resource.
>
> So, I was thinking that you would specify "object A to a component" by just
> including the external resource description as part of the component's
> metadata.
>
> 2) How to specify an "object B" as a parameter to "object A":
> 2a) Object A gets to define a key (or keys) for its parameters. Let's say uses
> "myObjB" as the key.
> Object A gets to decide how to interpret the value for this key coming from the
> external settings file. (Not architected by UIMA).
> 2b) At a time chosen by Object A, when object A is "running", it reads the value
> of the key "myObjB" from the external settings file, and then interprets this in
> any way it chooses, and then uses that to define Object B (again, this would be
> arbitrary, not architected by UIMA)
I would still like to see an example of how these parameters that are *not*
"external resource parameters" but "overrides" are specified in code.

For external resources, based on the current "external resource parameters"
mechanism, uimaFIT defines a convenient way of composing resources and
components, specifically providing parameter values *locally* to each single
declared resource, i.e. there is *no chance for conflict* between e.g. multiple
instances of the same resource type being used in a pipeline.

Below is an example of how external resources (even nested ones) can be bound to
an analysis engine. The parameter values of the external resources are provided
locally for each resource. Note that calling createExternalResourceDescription
twice for the same class creates two distinct external resource instances of
that class, which can be bound independently.

----
createEngineDescription(ExtractFeaturesConnector.class,
    ExtractFeaturesConnector.PARAM_OUTPUT_DIRECTORY, outputPath,
    ExtractFeaturesConnector.PARAM_DATA_WRITER_CLASS, WekaDataWriter.class,
    ExtractFeaturesConnector.PARAM_LEARNING_MODE, Constants.LM_SINGLE_LABEL,
    ExtractFeaturesConnector.PARAM_FEATURE_MODE, Constants.FM_DOCUMENT,
    ExtractFeaturesConnector.PARAM_ADD_INSTANCE_ID, true,
    ExtractFeaturesConnector.PARAM_FEATURE_FILTERS, new String[] {},
    ExtractFeaturesConnector.PARAM_IS_TESTING, false,
    ExtractFeaturesConnector.PARAM_FEATURE_EXTRACTORS,
==>     asList(createExternalResourceDescription(EmoticonRatio.class,
            EmoticonRatio.PARAM_UNIQUE_EXTRACTOR_NAME, "123"),
==>         createExternalResourceDescription(NumberOfHashTags.class,
            NumberOfHashTags.PARAM_UNIQUE_EXTRACTOR_NAME, "1234")));
----

My understanding is that you want to deprecate the existing "parameters"
mechanism in external resources in favor of the "external override" mechanism.
Hence, I would like to know how creating resource descriptions, binding them,
and setting their parameters would be done programmatically, relying just on the
"override" mechanism and not on the existing "parameters" mechanism. More on
this in the last section below.

> 3) how to set non-String parameters? Both the external settings and the normal
> UIMA configuration parameter settings (I'm thinking of the XML descriptor)
> represent these as strings. So the number 1.0 is represented as the string
> "1.0", and the code that gets configuration parameter settings is responsible
> for type conversions, for instance, converting the string to the declared
> configuration parameter type.
>
> For accessing directly external settings, there is no architected place for
> specifying the "type" of the parameter, other than the configuration
> declarations (which could be used for simple UIMA types only); the external
> settings API returns just the string (or an array of strings, which is
> supported) to the caller, and it's up to the caller to then do whatever
> interpretation of this string value is desired (not architected by UIMA).

The ConfigurableDataResourceSpecifier uses a ResourceMetaData object for
parameters. ResourceMetaData supports non-String parameter values via
ConfigurationParameterDeclarations and ConfigurationParameterSettings. Types of
parameters are declared in ConfigurationParameterDeclarations, and the framework
handles the conversion between the external String form and the internal
parameter values. It is not up to the component or resource to implement a
conversion mechanism for each parameter.

> 4) re: disambiguating parameters for multiple instances of a shared resource.
> UIMA today has the ability to have multiple instances of a shared resource, e.g.
> a "dictionary" that is parameterized by "language";
> multiple instances of these can be loaded.
> The "get resource" api for this
> includes specifying the parameter(s) to select the proper one, and each instance
> that is created gets a initial "load" call whose argument can identify the
> instance.
>
> So, (not architected by UIMA) the implementation could, for example, define a
> set of "keys": e.g.
> my_thesaurus_en, my_thesaurus_de, ... for some parameters that are dependent on
> a language code.
>
> Beyond this, External Resources doesn't support multiple instances, and I had
> not considered extending this (as part of this discussion, which was about how
> to read configuration parameters).

If I understand you correctly, you want the implementer of a resource to define
some naming convention to ensure that override names can be manually associated
with resources configured in specific ways, e.g. (pseudocode)

----
setOverride "de_dictionary" = "german.lexicon"
setOverride "en_dictionary" = "english.lexicon"

class DictionaryResource {
  def initialize(UimaContext ctx) {
    def lang = ctx.getParameter("lang");
    def lexicon = ctx.getOverride("${lang}_dictionary");
    loadLexicon(lexicon);
  }
}
----

If I have understood it correctly, that looks like a nice option for working
with, e.g., multi-language scenarios.

We have used external resources more in the context of machine learning,
specifically to model feature extractors. Here, we define multiple instances of
external resources, e.g. to obtain n-grams of different sizes.

----
// Defining two instances of the NGramExtractorResource with
// different parameters.
def unigrams = createResource(NGramExtractorResource.class,
    NGramExtractorResource.PARAM_SIZE, 1);

def bigrams = createResource(NGramExtractorResource.class,
    NGramExtractorResource.PARAM_SIZE, 2);

def analysisEngine = createEngine(Analyzer.class,
    Analyzer.KEY_EXTRACTORS, asList(unigrams, bigrams));
----

To that end, uimaFIT introduces a custom external resource type "ResourceList"
(extends Resource_ImplBase) [1], which is implicitly created in the call above.
So "unigrams" and "bigrams" bind to the implicitly created "resource list", and
the "resource list" binds to the analysis engine.

Can/should the setting of PARAM_SIZE in the example above be substituted using
the "external override" mechanism?

Cheers,

-- Richard

[1] https://svn.apache.org/repos/asf/uima/uimafit/trunk/uimafit-core/src/main/java/org/apache/uima/fit/internal/ResourceList.java
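
P.S. To make the naming-convention idea from the dictionary pseudocode a bit
more concrete, here is a minimal, self-contained sketch in plain Java. It is not
UIMA API: java.util.Properties merely stands in for the external settings, and
the class DictionaryKeySketch, the methods overrideKey and lexiconFor, and the
"<lang>_dictionary" key pattern are assumptions made up for illustration.

```java
import java.util.Properties;

// Hypothetical illustration only; Properties stands in for the external settings.
class DictionaryKeySketch {

    // Derives the override key from the resource's own "lang" parameter,
    // following the assumed "<lang>_dictionary" naming convention.
    static String overrideKey(String lang) {
        return lang + "_dictionary";
    }

    // Looks up the lexicon for a language and fails loudly if no override is set.
    static String lexiconFor(Properties settings, String lang) {
        String lexicon = settings.getProperty(overrideKey(lang));
        if (lexicon == null) {
            throw new IllegalStateException("No override for key " + overrideKey(lang));
        }
        return lexicon;
    }

    public static void main(String[] args) {
        Properties settings = new Properties();
        settings.setProperty("de_dictionary", "german.lexicon");
        settings.setProperty("en_dictionary", "english.lexicon");

        System.out.println(lexiconFor(settings, "de"));
        System.out.println(lexiconFor(settings, "en"));
    }
}
```

In this sketch, two resource instances configured with the same "lang" would
resolve to the same key, so the convention only disambiguates instances as far
as the resource author has encoded the distinguishing parameter into the key.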