Joe/Carlos,

Thanks for the replies. Here is a little more information that hopefully clarifies the problem statement.

We have System-A and System-B. Each time we bring on a new User, they get a new Organization (this will be the unique id) in both System-A and System-B. The new User will go in and set up their objects for their Organization (the schemas unique to the user). Say the first schema they set up is for a Customer object. They start with a base schema for a Customer that is predefined.

The Base Schema and Mappings

System-A Base Schema
    First_Name
    Last_Name

System-B Schema
    Full_Name

Base Mapping
    First_Name + Last_Name -> Full_Name

----

First User's Schema and mapping for their Organization:

System-A Organization 1
    First_Name
    Last_Name

System-B Schema
    Full_Name

Organization 1 Mapping
    First_Name + Last_Name -> Full_Name

First user makes no changes to the schema or mapping.

----

Second user comes onboard. They don't like the base schema, so they make theirs custom for their Organization. They do:

System-A Organization 2
    First_Name
    Middle_Name
    Last_Name

System-B Schema
    Full_Name

Organization 2 Mapping
    First_Name + Middle_Name + Last_Name -> Full_Name

I should mention that System-A is the only modifiable schema; System-B is fixed. The only things that can change are the System-A schema and its mapping to System-B, and both will be unique to each Organization.

Hope this clarifies a bit more.

Cheers,

Ryan H

On Thu, Oct 4, 2018 at 2:39 PM Joe Witt <[email protected]> wrote:
> Ryan
>
> I am not entirely sure I fully understand the scenario so read the
> following with a bit of caution.
>
> But the gist I think I'm reading is that system A generates data
> against a given schema referenceable with some guid, say 'GUID1'.  For
> every GUID1 from System A there is a similar but possibly different
> schema in System B, referenceable again by some guid, say GUID1_B.
>
> Are those base schemas you referenced, say GUID1 and GUID1_B, compatible
> in that all fields of import in GUID1 schema will be in GUID1_B
> schema?  As in, can you treat GUID1_B schemas as a superset of GUID1
> schemas?
>
> In any event, it is very possible that you can achieve this using the
> Record processors and Record oriented controller services.
>
> You can 'read' data from system A in NiFi using record readers that
> reference schemas from system A and 'write' data using system B schemas
> in NiFi.
>
> The record oriented processors combined with their pluggable record
> readers and writers allow this to happen.  The schema for the reader
> and writer can be different and the processors will map things over by
> field names/types.  That leaves the problem of 'how to access' the
> schema information so NiFi knows what to do during read/write phases.
> You can ensure all these schemas are in the NiFi schema
> registry and can be looked up by some well established name and naming
> scheme.  You could also do it via a SQL lookup if you have the schemas
> in some database (similar to what Carlos might be suggesting but
> without Jolt).  Or some other way.
>
> Now, if your mapping logic from System A schema to System B schema is
> more complex then you'll most likely want something like JOLT to help
> manage those transforms.
>
> You can also do some routing on the requests from System A to
> specialized JOLT mappers for each case of converting to System B
> schemas.  That will end up in a lot of config but it is likely you can
> parameterize this well using versioned flows and expression language
> statements in the variable registry.
>
> Thanks
> Joe
> On Thu, Oct 4, 2018 at 1:35 PM Ryan H <[email protected]>
> wrote:
> >
> > Hi All,
> >
> > I have been working on an integration between two systems with NiFi in
> > the middle as the integration point. Basically when an event happens on
> > System-A, such as a record update, those changes need to be propagated to
> > System-B. For the first use case, I have set up a data flow that listens
> > for incoming requests from System-A, performs a mapping, then sends the
> > mapped data to System-B.
> >
> > Generalized Flow for "Create_Event" (dumbed down significantly):
> > System-A "Create_Event" -> HandleHttpRequest -> JoltTransformJSON ->
> > InvokeHTTP -> System-B "Create_Event"
> >
> > This works great for the first case with a predefined mapping in
> > JoltTransformJSON.
> > Now I want to generalize it so that the same data flow
> > can be used for all Create_Events on System-A.
> >
> > Here is where the issue comes in. There is a base schema for System-A
> > that has a base mapping to the base schema in System-B. Users of the System
> > have the ability to "extend" the base schema to add/remove fields and
> > modify the base mapping. So each time the Create_Event happens, the mapping
> > that is used should be the unique mapping spec associated with that user
> > (call it a GUID that comes along with the request).
> >
> > The data flow is exactly the same for all Create_Events, except for the
> > mapping, which will be unique to the user.
> >
> > Does anyone know of a way to load up a different mapping to be used on a
> > per-request basis? I used JoltTransformJSON just as a proof of concept, so
> > it does not need to be a Jolt spec and can be modified to meet the needs of
> > whatever would work for this.
> >
> > I started to look into schema registry, but got a bit lost and
> > wasn't sure if it could be applied to this situation.
> >
> > Any help is, as always, greatly appreciated.
> >
> >
> > Cheers,
> >
> > Ryan H.
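
As a rough illustration of the per-Organization mapping described at the top of this message, here is a minimal Python sketch. It is not NiFi or Jolt configuration, only the lookup logic in plain code. It assumes the Organization id arrives with each request and that "+" in the mappings means joining the fields with a single space. The names BASE_MAPPING, ORG_MAPPINGS, map_record and the ids "org-1" and "org-2" are hypothetical.

```python
# Hypothetical sketch of a per-Organization mapping lookup.
# None of these names come from NiFi or Jolt.

# Base mapping: each System-B field lists the System-A fields that feed it.
BASE_MAPPING = {"Full_Name": ["First_Name", "Last_Name"]}

# Overrides keyed by Organization id; Organizations without an entry
# (such as Organization 1 in the example) fall back to the base mapping.
ORG_MAPPINGS = {
    "org-2": {"Full_Name": ["First_Name", "Middle_Name", "Last_Name"]},
}


def map_record(org_id, record):
    """Map one System-A record to the fixed System-B schema."""
    mapping = ORG_MAPPINGS.get(org_id, BASE_MAPPING)
    return {
        # Assumption: "+" means joining with a space; missing or empty
        # source fields are skipped.
        target: " ".join(record[f] for f in sources if record.get(f))
        for target, sources in mapping.items()
    }


print(map_record("org-1", {"First_Name": "Ada", "Last_Name": "Lovelace"}))
print(map_record("org-2", {"First_Name": "Ada", "Middle_Name": "King",
                           "Last_Name": "Lovelace"}))
```

The point of the sketch is that the flow stays identical for every request and only the mapping selected by the Organization id changes, which is the behaviour being asked for in the thread.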
