[
https://issues.apache.org/jira/browse/UIMA-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Klügl updated UIMA-4464:
------------------------------
Component/s: ruta
> AE string configuration parameters are trimmed in the CPE and when XML
> serializing
> ----------------------------------------------------------------------------------
>
> Key: UIMA-4464
> URL: https://issues.apache.org/jira/browse/UIMA-4464
> Project: UIMA
> Issue Type: Bug
> Components: ruta
> Reporter: Mario Juric
>
> These are my findings so far:
> Using the new gapText parameter in the UIMA Ruta HTMLConverter, I noticed that
> the string is trimmed during pipeline aggregation, e.g. " . " ends up as "."
> in the pipeline and when writing the pipeline to XML. I don’t think it has
> anything to do with the HTMLConverter in particular. We use UIMAfit to
> construct the aggregated analysis engine description, but I don’t know
> exactly where this trimming occurs. I was also able to run a small example
> pipeline where the trim did not happen, which was a bit of a surprise.
> The trimming is not a technical issue for me right now, but I felt it
> might become important in some other case. I only noticed it when I added
> extra spaces to improve the readability of my output. Initially I thought it
> was the HTMLConverter, but when I inspected it I could see that the trimming
> had happened somewhere before configuration parameter initialisation.
> I then inspected the UIMAfit-generated descriptor right after creation. The
> value was not trimmed at that point. Later, during runtime initialisation and
> without any XML serialization this time, the value is trimmed inside
> ConfigurationManagerImplBase::getConfigParameterValue right after the lookup
> operation (I used a debugger to inspect the value). That is inside a UIMA
> core component, so the trim occurs somewhere between descriptor creation
> and AE initialisation. It seems this is not a UIMAfit issue after all.
> I built a small example app where the HTMLAnnotator and HTMLConverter
> descriptors were also aggregated before execution, but there the trimming did
> not materialise at runtime, only in the serialised XML. Then it occurred
> to me that my example used the SimplePipeline whereas our main application
> uses the CPE. I then switched the main application to the SimplePipeline and
> the trimming was gone there as well. It seems that trimming only happens
> inside the CPE and when XML serialising the pipeline.
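
The symptom described above can be illustrated without UIMA at all. The following is only a minimal sketch using the JDK's built-in DOM parser, not UIMA's or uimaFIT's actual serialization code. The class name `TrimRoundTrip`, its helper methods, and the bare `<string>` wrapper element are hypothetical stand-ins chosen for this illustration. It shows that a plain XML round trip keeps the surrounding spaces of a value such as " . ", so a value arriving as "." suggests that some step explicitly trims it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of UIMA or uimaFIT.
class TrimRoundTrip {

    // Escape the characters that are significant in XML element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wrap the value in a single element, parse it back with the JDK DOM
    // parser, and return the text content exactly as the parser reports it.
    static String xmlRoundTrip(String value) {
        String xml = "<string>" + escape(value) + "</string>";
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML round trip failed", e);
        }
    }

    // True if the value is unchanged, including leading/trailing whitespace.
    static boolean survives(String before, String after) {
        return before.equals(after);
    }

    public static void main(String[] args) {
        String gapText = " . ";                 // value from the report
        String parsed = xmlRoundTrip(gapText);  // plain parse keeps the spaces
        String trimmed = parsed.trim();         // an explicit trim drops them

        System.out.println("parsed  = [" + parsed + "] survives=" + survives(gapText, parsed));
        System.out.println("trimmed = [" + trimmed + "] survives=" + survives(gapText, trimmed));
    }
}
```

A check like `survives` could be applied to the configured value at each stage mentioned in the report (after descriptor creation, after XML serialization, after AE initialisation) to narrow down where the whitespace is lost.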