Aruna,

In addition to considering how you'll store and access the provenance data (if it lives outside the NiFi repo), you'll also need to think of the flow as acting as an 'indexer' for the things you want to be able to reference. If you can extract features of the data (a page number or cell reference, say) as flowfile attributes, those attributes become part of the provenance record, which you can then query.

Once you have a provenance record you can trace it up or down its lineage graph. The trick is finding a record of interest. You must also be careful not to index too many things, or the flowfile attributes will bloat memory.
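To make the 'indexer' idea a bit more concrete, here is a rough Python sketch. It is a toy model only and does not use NiFi's API: the class names, the `source.page` attribute, and the event IDs are all made up for illustration. It shows the three points above: only attributes you chose to index are searchable, a found record can be walked up or down its lineage, and everything you index costs memory.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ProvenanceEvent:
    """Hypothetical stand-in for one provenance record."""
    event_id: int
    event_type: str                 # e.g. "RECEIVE", "FORK", "SEND"
    attributes: dict                # flowfile attributes at the time of the event
    parent_ids: list = field(default_factory=list)


class ProvenanceIndex:
    """Toy index: searchable only by the attribute keys chosen up front."""

    def __init__(self, indexed_keys):
        self.indexed_keys = list(indexed_keys)
        self.events = {}
        self.children = defaultdict(list)
        self.by_attr = defaultdict(list)

    def record(self, event):
        self.events[event.event_id] = event
        for parent_id in event.parent_ids:
            self.children[parent_id].append(event.event_id)
        # Only the chosen keys are indexed; indexing everything would bloat memory.
        for key in self.indexed_keys:
            if key in event.attributes:
                self.by_attr[(key, event.attributes[key])].append(event.event_id)

    def find(self, key, value):
        """Return IDs of events whose indexed attribute matches."""
        return list(self.by_attr.get((key, value), []))

    def _walk(self, start_ids, next_ids):
        seen, stack = [], list(start_ids)
        while stack:
            event_id = stack.pop()
            if event_id not in seen:
                seen.append(event_id)
                stack.extend(next_ids(event_id))
        return seen

    def ancestors(self, event_id):
        """Trace up the lineage graph toward the source."""
        return self._walk(self.events[event_id].parent_ids,
                          lambda eid: self.events[eid].parent_ids)

    def descendants(self, event_id):
        """Trace down the lineage graph toward the destination."""
        return self._walk(self.children.get(event_id, []),
                          lambda eid: self.children.get(eid, []))


# Index only the (hypothetical) "source.page" attribute.
index = ProvenanceIndex(indexed_keys=["source.page"])
index.record(ProvenanceEvent(1, "RECEIVE", {"filename": "report.pdf"}))
index.record(ProvenanceEvent(2, "FORK",
                             {"filename": "report.pdf", "source.page": "7"},
                             parent_ids=[1]))
index.record(ProvenanceEvent(3, "SEND",
                             {"filename": "report.pdf", "source.page": "7"},
                             parent_ids=[2]))

hits = index.find("source.page", "7")      # records of interest
lineage_up = index.ancestors(hits[-1])     # back to the original document
```

Note that searching on `filename` returns nothing here because it was never indexed, which is the trade-off: you can only find records by the features you decided to extract and index ahead of time.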
Thanks

On Tue, Nov 21, 2017 at 11:03 AM, Aruna Sankaralingam <aruna.sankaralin...@cormac-corp.com> wrote:
> Jeremy,
>
> We don’t have the target database info yet. I will let you know once I get
> that info.
>
> Thanks
> Aruna
>
> From: Jeremy Dyer [mailto:jdy...@gmail.com]
> Sent: Monday, November 20, 2017 10:43 PM
> To: users@nifi.apache.org
> Subject: Re: NiFi
>
> Aruna - if I’m reading your first email correctly you certainly can for
> structured data. You can do this for unstructured data as well, but most of
> this depends on your ultimate datastore and how your data is stored. If you
> don’t mind me asking, what is the destination datastore?
>
> On Nov 20, 2017, at 10:28 PM, Aruna Sankaralingam
> <aruna.sankaralin...@cormac-corp.com> wrote:
>
> Joe
>
> When I googled before I sent this email, I found that we can do data lineage
> with Nifi.
>
> But I don't know if it can reference back to page numbers of docs or cell
> reference.
>
> Thanks
> Aruna
>
> On Nov 20, 2017, at 7:49 AM, Joe Witt <joe.w...@gmail.com> wrote:
>
> aruna
>
> can you share pointers to nifi documentation that is leaving you unsure of
> what it can do relative to your requirements?
>
> thanks
> joe
>
> On Nov 20, 2017 7:43 AM, "Aruna Sankaralingam"
> <aruna.sankaralin...@cormac-corp.com> wrote:
>
> Hi,
>
> Could someone please let me know if Nifi can be used for:
>
> Detailed reference back to source required for any analysis (e.g., page
> numbers for documents, cell reference for structured data)
>
> And also for scheduling?
>
> Thanks
> Aruna
>
> On Nov 17, 2017, at 3:10 PM, Aruna Sankaralingam
> <aruna.sankaralin...@cormac-corp.com> wrote:
>
> Hi,
>
> We are currently working on an RFI, the purpose of which is to seek
> information from the industry on the current product and service offerings
> of an analytical tool that is a cloud-ready, fully scalable enterprise
> platform that ingests multiple large structured or unstructured datasets.
>
> A couple of other desired capabilities include:
>
> · Detailed reference back to source required for any analysis (e.g.,
>   page numbers for documents, cell reference for structured data)
>
> · Ability to bring in or ingest data manually or by a scheduling
>   engine
>
> Could someone please let me know if Nifi can be used for the two desired
> capabilities mentioned above and if so, can you provide more info?
>
> Thanks
> Aruna