Hello Balaji,

Thank you for your interest in Apache NiFi. The community is indeed helpful and can answer a lot of questions. These questions are pretty high level, which suggests you are still in the early phases of learning about NiFi. Please take a look through the documentation linked below.

NiFi does support event-based scheduling for a number of processors. Whether it is supported for a given processor depends on whether it makes sense for that processor and whether its developer enabled that feature. If it is available, it can be chosen via the UI or when configuring the processor directly via the REST API.

NiFi can certainly be configured to pull from multiple tables at once, within a single database or across multiple databases. Consider using QueryDatabaseTable processor(s) for this. You can set up controller services for the necessary database connection pools as well.

Regarding transformations on the data: yes, there are quite a few processors to do this as well.

For systems (such as the Spark or Hive jobs you mention) that offer a clean way to initiate a job given some data, and that let NiFi come back later asynchronously to check on the results, NiFi might well be a perfectly fine way to manage those jobs.

Yes, NiFi can be used for workflow management, but this is an extremely broad topic area, so it is best to be specific about a particular use case. In NiFi you are building flows, which are essentially state machines. Inherently, then, to have reached a certain state in the flow graph means you have already been through the preceding states, which covers the dependency at the level you have described.

I'd encourage you to break down your questions/ideas into more focused items so the community can more meaningfully assist you with your evaluation.

https://nifi.apache.org/docs.html
  - Overview
  - User Guide
  - Walk through some of the processors.
https://cwiki.apache.org/confluence/display/NIFI/Apache+NiFi
https://cwiki.apache.org/confluence/display/NIFI/FAQs

Thanks
Joe

On Tue, Apr 12, 2016 at 10:27 AM, Balaji K Hari <[email protected]> wrote:
> Hi Team,
>
> Based on the project requirements, I was looking at different features
> included in Apache NIFI and found that this would be the good way to interact
> with the Development team who have developed NIFI and are looking for
> suggestions/inputs from the User community to improvise the product and also
> it is a great medium where the users who are using this NIFI would get the
> valuable inputs from the developers for their requirements.
>
> Need your assistance/inputs on the below requirements and how these can be
> implemented in NIFI to achieve the solution.
>
>
> -> I have observed that, Event Based Scheduling/Any Trigger Based Scheduling
> is yet to be included in the latest NIFI product. Any
> workarounds/alternatives to achieve this?
>
> -> Can Spark/Hive Jobs can be scheduled on time basis and also executed
> through NIFI? If Yes, please suggest how can we do this?
>
> -> Can we get the data from multiple tables of Oracle/SQL Server/Teradata and
> put directly in S3/HDFS and also directly to RedShift/Any database? If Yes,
> please suggest how can we do this?
>
> -> Also can we do the transformations/manipulations on the data while moving
> it to S3/HDFS from RDBMS databases? If Yes, please suggest how can we do this?
>
> -> Can we do the validations and also find the duplicate data/records before
> you put the data into S3/HDFS. For example, I have moved the data from RDBMS
> tables into S3 and as part of daily loads, I need to check whether any
> duplicate records are present in the new load and need to remove those
> records while data movement itself. Please provide your inputs how can we do
> this?
>
> -> Also can you provide valuable inputs on how can we achieve the workflow
> execution dependency i.e. For example, I have designed one workflow and
> based on this 1st workflow execution completion, I need to start the second
> workflow else need to start another workflow. Can this be achieved in NIFI?
>
> It would be really helpful and appreciated on the above inputs, as you would
> be the best team who can help the us the solutions/workarounds in using the
> NIFI product as it is been identified as a good user friendly product for
> Data Ingestion/movement.
>
> Looking forward for your reply with the requested suggestions and solutions.
> Thanks in Advance!!!! :):)
>
> Regards,
> _______________________________________________________________________
> Balaji KNV_Hari
> Technical Architect
>
> This message contains information that may be privileged or confidential and
> is the property of the Capgemini Group. It is intended only for the person to
> whom it is addressed. If you are not the intended recipient, you are not
> authorized to read, print, retain, copy, disseminate, distribute, or use this
> message or any part thereof. If you receive this message in error, please
> notify the sender immediately and delete all copies of this message.
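
To make the "flows are essentially state machines" point from the reply concrete, here is a minimal sketch in plain Python of the dependency asked about in the quoted message: run a first workflow, then start a second one if it succeeds or a different one if it fails. This is not NiFi code and does not use any NiFi API. The names `run_flow`, `steps`, `routes`, and the `workflow_1`/`workflow_2`/`workflow_3` labels are all hypothetical. In NiFi itself, the equivalent routing would typically be expressed by connecting a processor's outgoing relationships (for example success and failure) to different downstream processors on the canvas.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical outcome labels, loosely modelled on processor relationships.
SUCCESS = "success"
FAILURE = "failure"


def run_flow(
    steps: Dict[str, Callable[[], str]],
    routes: Dict[Tuple[str, str], str],
    start: str,
) -> List[str]:
    """Run steps from `start`, following routes[(step, outcome)].

    Stops when the current step's outcome has no outgoing route.
    Returns the ordered list of step names that were executed.
    """
    visited: List[str] = []
    current: Optional[str] = start
    while current is not None:
        visited.append(current)
        outcome = steps[current]()               # execute the step
        current = routes.get((current, outcome))  # pick the next state, if any
    return visited


# Hypothetical workflows: each returns the outcome it finished with.
steps = {
    "workflow_1": lambda: SUCCESS,
    "workflow_2": lambda: SUCCESS,
    "workflow_3": lambda: SUCCESS,
}

# workflow_2 runs only after workflow_1 succeeds; workflow_3 only after it fails.
routes = {
    ("workflow_1", SUCCESS): "workflow_2",
    ("workflow_1", FAILURE): "workflow_3",
}

if __name__ == "__main__":
    print(run_flow(steps, routes, "workflow_1"))
```

Because a step is reached only by following a route out of an earlier step, reaching `workflow_2` implies `workflow_1` already completed successfully, which is the dependency guarantee described in the reply.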
