Hey folks,

Good commentary. I would encourage you to create associated tickets where applicable so that we can track these ideas and the related efforts at the community project level.

Concerning building: Randy, if you could provide more details on your OS X build problems, that would be very helpful. A number of contributors have OS X machines and seem to have reasonable success, so any details on your environment would help in tracking down the problem.

I certainly understand the concern about wanting things to work on a wide variety of stock systems. This was voiced in part by https://issues.apache.org/jira/browse/MINIFI-118. We have options here depending on what the target environment will support, such as more static linking, which may be acceptable for larger systems running more enterprise-level OSes.

LMDB seems like an interesting candidate from some initial glances over it, and its license (OpenLDAP Public License) seems like a variant of 3-Clause BSD, so it should be okay to use from an ALv2 standpoint. It is definitely worth pursuing and, as mentioned in the prior thread, there are no hard and fast commitments to a particular technology. The goal, especially in these early stages, is to establish the interfaces and framework and provide a working implementation so that there is a place to start.

Concerning the idea of integrating provenance with FlowFiles, I can certainly see the value in bundling it with the FlowFiles from the standpoint of minimizing footprint and resource utilization on the device/source. One important item to be mindful of, which has come up with a number of folks looking to tackle management of dataflow, is limited communications and/or prohibitive cost in large deployments of such agents. A separate provenance repository allows provenance events to be sent out of band, when convenient or when explicitly requested or needed. On another aspect of that idea, including provenance in each FlowFile could exhaust disk more quickly if no means of transmission is available.
In this case, discrete storage mechanisms would allow provenance to be purged without losing data that could otherwise continue to be buffered. That's not to say this use case is any more valid or important; it is another point to consider in the design choices made for data/provenance storage and transmission.

I think the key point for the effort is that there are many widely varying use cases and situations governing how this implementation needs to be built, deployed, and used. That makes for some interesting discussions and design processes, and should be a rewarding challenge.

Thanks for the input!

On Tue, Nov 29, 2016 at 4:56 AM, Daniel Cave <[email protected]> wrote:

> Since MiNiFi C++ requires completely new code (unlike the Java version), I
> don't see any reason we can't deviate where it makes requirement sense. If
> we move the provenance onto the flowfile, then your build issues and my
> stability issues can be simplified because the local provenance repo
> becomes log only and where the local repo could be handled by a standard
> logging mechanism instead. As you stated, installing additional open source
> libraries in production environments is a near non-starter.
>
> If no one disagrees with the approach or really desperately wants to take
> it on, I'm ok with taking the action item to start working on a good
> transport structure and looking at making the changes needed for it to work
> through S2S. This also requires making changes to NiFi to allow for the
> provenance to be added to the main NiFi repo; this is something I was
> planning on doing anyway as part of a new enterprise dp/dg engine based on
> NiFi I'm working on.
>
> We need someone to test a reliable replacement for LevelDB (be that LMDB,
> which I believe comes standard in RHEL distributions, or whatever) and
> integrate it or convert the local repo to log only.
> I'll get to it eventually after I make the other changes if no one else
> does.
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/MiNiFi-C-Data-Provenance-and-Related-Issues-tp14024p14040.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
