Hi,

Any advice on the 'best' architectural approach where some processing function has to be applied to every flowfile in a dataflow, with some (possible) output based on flowfile content? E.g. inspect log files for a specific IP, then send a message to syslog.
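For concreteness, the per-flowfile function in that example might look roughly like the sketch below. It is plain Python purely for illustration, not NiFi or Spark code. The names find_ip_hits, to_syslog_line and send_udp, the tag "flowscan" and the local0/warning priority are all hypothetical, and the syslog line is a simplified BSD-style message with the timestamp left out.

```python
import re
import socket
from typing import Iterable, List, Tuple


def find_ip_hits(log_text: str, target_ip: str) -> List[str]:
    """Return the log lines that mention target_ip as a whole address."""
    # The lookarounds stop 10.0.0.5 from matching inside 10.0.0.50 or
    # 110.0.0.5, while still allowing a trailing full stop after the IP.
    pattern = re.compile(
        r"(?<!\d)(?<!\d\.)" + re.escape(target_ip) + r"(?!\d|\.\d)"
    )
    return [line for line in log_text.splitlines() if pattern.search(line)]


def to_syslog_line(hit: str, host: str, facility: int = 16,
                   severity: int = 4, tag: str = "flowscan") -> str:
    """Build a simplified BSD-style syslog line (timestamp omitted)."""
    # PRI is facility * 8 + severity; 16 is local0 and 4 is warning.
    pri = facility * 8 + severity
    return f"<{pri}>{host} {tag}: {hit}"


def send_udp(messages: Iterable[str],
             address: Tuple[str, int] = ("127.0.0.1", 514)) -> None:
    """Send each message as one UDP datagram to the assumed syslog address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for message in messages:
            sock.sendto(message.encode("utf-8"), address)
```

Either approach would wrap logic of this shape: the content of one flowfile goes in, zero or more syslog messages come out, so the choice is mostly about where this function runs and how it scales.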

Approach 1: Spark
Output port from NiFi -> Spark listens to that stream -> processes and outputs accordingly
Advantages: scale the Spark job on YARN, decoupled (reusable) from NiFi
Disadvantages: adds complexity, decoupled from NiFi

Approach 2: NiFi
Custom processor -> PutSyslog
Advantages: reuses existing NiFi processors/capability, obvious flow (design intent)
Disadvantages: scale??

Any comments/advice/experience of either approach?

Thanks
Conrad
