Steve

Thank you very much for your desire to contribute and for such a detailed 
explanation of your contribution.
I left some comments inline, marked [OLEG], so let us know what you think.

Cheers
Oleg

> On Mar 7, 2017, at 10:17 AM, Steve Lawrence <[email protected]> wrote:
> 
> We have developed a NiFi processor that uses XMLCalabash [1] to add
> support for XProc [2] processing. XProc is an XML transformation
> language that defines an XML pipeline, allowing for complex validation,
> transformation, and routing of XML data within the pipeline, using
> existing XML technologies such as RelaxNG, Schematron, XSD Schema,
> XQuery, XSLT, XPath and custom XProc transformations.
> 
> This new processor is mostly straightforward, but we had some questions
> regarding the specific implementation and the handling of non-thread
> safe code. The code is available for viewing here:
> 
> 
> https://opensource.ncsa.illinois.edu/bitbucket/projects/DFDL/repos/nifi-xproc/browse
> 
> In this processor, a property is created to provide an XProc file, which
> defines the pipeline input and output "ports". XML goes into an input
> port, goes through the pipeline, and one or more XML documents exit at
> specified output ports. This NiFi processor maps each output port to a
> dynamic NiFi relationship. It does this mapping in the
> onPropertyModified method when the XProc file property is changed. This
> method also stores the XMLCalabash XRuntime and XPipeline objects (which
> do all the pipeline work) in volatile member variables to be used later
> in onTrigger. The members are saved here to avoid recreating them in
> each call to onTrigger. Is this an acceptable place to do that? It seems
> this normally happens in an @OnScheduled method or in the first call to
> onTrigger, however the objects must be created in onPropertyModified to
> get the output ports, so this does avoid recreating the same objects
> multiple times.
[OLEG] Without getting into more detail, both approaches are acceptable. 
However, assigning values in onTrigger() is preferable in certain cases. 
Those cases primarily involve obtaining references to a remote resource 
(e.g., a connection factory or a socket), where exception handling is much 
simpler. I can elaborate further if needed and point to a few examples where 
we do that, but that does not appear to be the case for you, so your current 
approach seems acceptable. As for multi-threading in onTrigger(), such 
assignments are typically done in a synchronized block with a null check.
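A minimal sketch of the synchronized null-check assignment mentioned above, 
written in plain Java with no NiFi dependencies so it can run on its own. The 
LazyResource class and its factory are hypothetical names used only for 
illustration; in a processor the factory would be whatever builds the expensive 
object on the first call to onTrigger().

```java
import java.util.function.Supplier;

// Hypothetical helper: creates an expensive object once, on first use.
final class LazyResource<T> {
    private final Supplier<T> factory;
    // volatile so that a thread reading the field sees a fully built object
    private volatile T resource;

    LazyResource(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T local = resource;
        if (local == null) {                 // cheap check without locking
            synchronized (this) {
                local = resource;
                if (local == null) {         // re-check while holding the lock
                    local = factory.get();   // may throw; the field then stays null
                    resource = local;
                }
            }
        }
        return local;
    }

    // Drop the cached object, e.g. when the configuration changes.
    void reset() {
        synchronized (this) {
            resource = null;
        }
    }
}
```

If the factory throws, nothing is cached, so the next call simply retries, 
which is part of why this style keeps exception handling simple.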

> Also note that the same objects are created in the
> XML_PIPELINE_VALIDATOR but are not saved due to the validator being
> static, so there is already some duplication. Is there a standard way to
> avoid duplication/is this an acceptable way to handle this?

[OLEG] I don't fully understand the question, but keep in mind that regardless 
of the number of threads, there is only one instance of the processor at any 
given time, so any reference held by that instance is essentially a singleton 
as well. Does that help?
> 
> The other concern we have is that the XPipeline and XRuntime objects
> created by XML Calabash are not thread safe. To resolve this issue, the
> processor is annotated with @TriggerSerially. Is this the correct
> solution, or is there some other preferred method? Perhaps ThreadLocal
> or a thread safe pool of XPipeline objects is preferred?

[OLEG] Definitely not ThreadLocal, since there is no guarantee that you will 
get the same thread, or any particular thread, on a subsequent invocation. 
@TriggerSerially is obviously the most defensive way to avoid collisions. That 
said, I probably need to understand the issue better. Off the top of my head, 
though, one way to ensure correctness in such scenarios is to maintain a Map of 
such objects as an instance variable (like a pool), where the key is something 
that ensures you always get the correct object.
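One possible reading of the pool idea, sketched in plain Java. Note that this 
uses a queue of idle instances rather than a keyed Map, and that SimplePool and 
withInstance are hypothetical names, not NiFi or XMLCalabash API. Each caller 
gets exclusive use of one instance for the duration of its work, so the pooled 
objects themselves do not need to be thread safe.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical pool: hands each caller its own instance of a non-thread-safe object.
final class SimplePool<T> {
    private final ConcurrentLinkedQueue<T> idle = new ConcurrentLinkedQueue<>();
    private final Supplier<T> factory;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    <R> R withInstance(Function<T, R> work) {
        T instance = idle.poll();        // reuse an idle instance if one exists
        if (instance == null) {
            instance = factory.get();    // otherwise create a new one
        }
        try {
            return work.apply(instance);
        } finally {
            // Return the instance for reuse. A real pool might discard it
            // instead if the work failed and left it in an unknown state.
            idle.offer(instance);
        }
    }
}
```

The pool grows to at most the number of concurrent callers, so with a single 
thread it behaves much like a cached single instance.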
> 
> 
> Lastly, is this something the devs would be interested in pulling into
> NiFi, and if not, what could be changed to achieve this? The code is
> licensed as Apache v2 and we would be happy to contribute the code to
> NiFi if deemed acceptable.

[OLEG] This is probably the most difficult question to answer, since the 
immediate answer is we don’t know ;) Only the community can decide. So what I 
would suggest is to raise a JIRA - https://issues.apache.org/jira/browse/NIFI 
and submit a PR for it and see if it gets any traction. Furthermore, we are 
currently working on the concept of an Extension/Artifact Registry to 
accommodate the growing demand for more NiFi components. 
> 
> Thanks,
> - Steve
> 
> [1] http://xmlcalabash.com/
> [2] https://www.w3.org/TR/xproc/
> 
