Repository: nifi-minifi-cpp
Updated Branches:
  refs/heads/MINIFI-77 7cf8199a7 -> 4d4fd963f (forced update)
MINIFI-77 Initial draft of the Quickstart guide in the README. Adding additional information to the quickstart guide and README. Removing unneeded properties within minifi.properties.

Project: http://git-wip-us.apache.org/repos/asf/nifi-minifi-cpp/repo
Commit: http://git-wip-us.apache.org/repos/asf/nifi-minifi-cpp/commit/4d4fd963
Tree: http://git-wip-us.apache.org/repos/asf/nifi-minifi-cpp/tree/4d4fd963
Diff: http://git-wip-us.apache.org/repos/asf/nifi-minifi-cpp/diff/4d4fd963

Branch: refs/heads/MINIFI-77
Commit: 4d4fd963fb5bbf441e79b4b50e4a798fa1aebd5f
Parents: 63d2358
Author: Aldrin Piri <[email protected]>
Authored: Fri Aug 5 08:56:12 2016 -0400
Committer: Aldrin Piri <[email protected]>
Committed: Mon Aug 8 00:30:40 2016 -0400

----------------------------------------------------------------------
 README.md              | 192 +++++++++++++++++++++++++++++---------------
 conf/minifi.properties | 159 +-----------------------------------
 2 files changed, 127 insertions(+), 224 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/nifi-minifi-cpp/blob/4d4fd963/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index 233caf4..4e63b7e 100644
--- a/README.md
+++ b/README.md
@@ -18,10 +18,135 @@ MiNiFi is a child project effort of Apache NiFi. This repository is for a nativ
 ## Table of Contents
+- [Features](#features)
+- [Caveats](#caveats)
+- [Requirements](#requirements)
+- [Getting Started](#getting-started)
+- [Getting Help](#getting-help)
+- [Documentation](#documentation)
 - [License](#license)
-## License
+## Features
+
+Apache NiFi - MiNiFi C++ is a complementary data collection approach that supplements the core tenets of [NiFi](http://nifi.apache.org/) in dataflow management, focusing on the collection of data at the source of its creation. The C++ implementation is an additional implementation to the one in Java with the aim of an even smaller resource footprint.
+
+Specific goals for MiNiFi are comprised of:
+- small and lightweight footprint
+- central management of agents
+- generation of data provenance
+- integration with NiFi for follow-on dataflow management and full chain of custody of information
+
+Perspectives of the role of MiNiFi should be from the perspective of the agent acting immediately at, or directly adjacent to, source sensors, systems, or servers.
+
+## Caveats
+* 0.0.1 represents the first release, APIs and interfaces are subject to change
+* Build and usage currently only supports Linux and OS X environments. Providing the needed tooling to support Windows will be established as part of [MINIFI-34](https://issues.apache.org/jira/browse/MINIFI-34).
+* Currently, provenance events are not yet generated. This effort is captured in [MINIFI-78](https://issues.apache.org/jira/browse/MINIFI-78).
+* Using Site to Site requires the additional manual step of specifying the remote socket. This being autonegotiated through NiFi's REST API is captured in [MINIFI-70](https://issues.apache.org/jira/browse/MINIFI-70).
+* The processors currently implemented include:
+  * TailFile
+  * GetFile
+  * GenerateFlowFile
+  * LogAttribute
+
+## System Requirements
+### To build
+#### Utilities
+* Make
+* GCC
+  * 4.8.4 or greater
+* G++
+  * 4.8.4 or greater
+
+#### Libraries / Development Headers
+* libboost and boost-devel
+  * 1.23.0 or greater
+* libxml2 and libxml2-devel
+
+### To run
+#### Libraries
+* libxml2
+
+## Building
+
+From your source checkout, perform `make` in the root of the directory where the Makefile is located. For parallel building, the '-j' or '--jobs' option. On an average development machine, a serial build takes approximately 90 seconds.
+
+    # ~/Development/code/apache/nifi-minifi-cpp on git:master
+    $ make
+    make -C thirdparty/yaml-cpp-yaml-cpp-0.5.3
+    mkdir -p ./build
+    g++ -Os -I./include -c -o build/parse.o src/parse.cpp
+    mkdir -p ./build
+    g++ -Os -I./include -c -o build/parser.o src/parser.cpp
+    mkdir -p ./build
+    g++ -Os -I./include -c -o build/regex_yaml.o src/regex_yaml.cpp
+    ...
+
+
+## Clean
+Generated files and artifacts can be removed by performing a `make clean`.
+
+    # ~/Development/code/apache/nifi-minifi-cpp on git:master
+    $ make clean
+    rm -rf ./build
+    rm -rf ./target
+    rm -rf ./assemblies
+    make -C thirdparty/yaml-cpp-yaml-cpp-0.5.3 clean
+    rm -rf ./lib ./build
+    make -C thirdparty/uuid clean
+    rm -f *.o libuuid.a
+    find ./ -iname "*.o" -exec rm -f {} \;
+
+## Configuring
+The 'conf' directory in the root contains a template flow.yml document. This is compatible with the format used with the Java MiNiFi application. Currently, a subset of the configuration is supported. Additional information on the YAML format for the flow.yml can be found in the [MiNiFi System Administrator Guide](https://nifi.apache.org/minifi/system-admin-guide.html).
+
+    Flow Controller:
+        name: MiNiFi Flow
+
+    Processors:
+        - name: GetFile
+          class: org.apache.nifi.processors.standard.GetFile
+          max concurrent tasks: 1
+          scheduling strategy: TIMER_DRIVEN
+          scheduling period: 1 sec
+          penalization period: 30 sec
+          yield period: 1 sec
+          run duration nanos: 0
+          auto-terminated relationships list:
+          Properties:
+              Input Directory: /tmp/getfile
+              Keep Source File: true
+
+    Connections:
+        - name: TransferFilesToRPG
+          source name: GetFile
+          source relationship name: success
+          destination name: 471deef6-2a6e-4a7d-912a-81cc17e3a204
+          max work queue size: 0
+          max work queue data size: 1 MB
+          flowfile expiration: 60 sec
+
+    Remote Processing Groups:
+        - name: NiFi Flow
+          url: http://localhost:8080/nifi
+          timeout: 30 secs
+          yield period: 10 sec
+          Input Ports:
+              - id: 471deef6-2a6e-4a7d-912a-81cc17e3a204
+                name: From Node A
+                max concurrent tasks: 1
+                Properties:
+                    Port: 10001
+                    Host Name: localhost
+
+## Running
+After completing a [build](#building), the application can be run by issuing:
+
+    $ ./target/minifi
+
+By default, this will make use of a flow.yml located in the conf directory. This configuration file location can be altered by adjusting the property `nifi.flow.configuration.file` in minifi.properties located in the conf directory.
+## License
 Except as otherwise noted this software is licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0.html)
@@ -36,68 +161,3 @@ distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-## Dependencies
- * gcc - 4.8.4
- * g++ - 4.8.4
- * [libxml2](http://xmlsoft.org/) - tested with 2.9.1
- MAC: brew install libxml2
- * [libuuid] https://sourceforge.net/projects/libuuid/
- MAC: After download the above source, configure/make/make install
-
-## Build instructions
-
-Build application, it will build minifi exe under build and copy over to target directory
-
- $ make
-
-Clean
-
- $ make clean
-
-## Running
-
-Running application
-
- $ ./target/minifi
-
-The Native MiNiFi example flow.xml is in target/conf
-It show cases a Native MiNiFi client which can generate flowfile, log flowfile and push it to the NiFi server.
-Also it can pull flowfile from NiFi server and log the flowfile.
-The NiFi server config is target/conf/flow_Site2SiteServer.xml
-
-For trial command control protocol between Native MiNiFi and NiFi Server, please see the example NiFi Server implementation in test/Server.cpp
-The command control protocol is not finalized yet.
-
-Caveat:
-1)
-Add new propery HostName and Port into RemoteProcessGroup InputOutput port for remote Site2Site hostname and port
-<remoteProcessGroup>
-  <id>8f3b248f-d493-4269-b317-36f85719f480</id>
-  <name>NiFi Flow</name>
-  <url>http://localhost:8081/nifi</url>
-  <timeout>30 sec</timeout>
-  <yieldPeriod>1 sec</yieldPeriod>
-  <transmitting>true</transmitting>
-  <inputPort>
-    <id>471deef6-2a6e-4a7d-912a-81cc17e3a204</id>
-    <name> From Node A</name>
-    <position x="0.0" y="0.0"/>
-    <comments/>
-    <scheduledState>RUNNING</scheduledState>
-    <maxConcurrentTasks>1</maxConcurrentTasks>
-    <useCompression>false</useCompression>
-    <property>
-      <name>Host Name</name>
-      <value>localhost</value>
-    </property>
-    <property>
-      <name>Port</name>
-      <value>10001</value>
-    </property>
-  </inputPort>
-2)
-Add new proerties into minifi.properties for command control
-# MiNiFi Server for Command Control
-nifi.server.name=localhost
-nifi.server.port=9000
-nifi.server.report.interval=1000 ms

http://git-wip-us.apache.org/repos/asf/nifi-minifi-cpp/blob/4d4fd963/conf/minifi.properties
----------------------------------------------------------------------
diff --git a/conf/minifi.properties b/conf/minifi.properties
index 854b84a..c6859b8 100644
--- a/conf/minifi.properties
+++ b/conf/minifi.properties
@@ -14,176 +14,19 @@
 # limitations under the License.
 # Core Properties #
-nifi.version=0.6.0-SNAPSHOT
+nifi.version=0.0.1
 nifi.flow.configuration.file=./conf/flow.yml
-nifi.flow.configuration.archive.dir=./conf/archive/
-nifi.flowcontroller.autoResumeState=true
 nifi.flowcontroller.graceful.shutdown.period=10 sec
 nifi.flowservice.writedelay.interval=500 ms
 nifi.administrative.yield.duration=30 sec
 # If a component has no work to do (is "bored"), how long should we wait before checking again for work?
 nifi.bored.yield.duration=10 millis
-nifi.authority.provider.configuration.file=./conf/authority-providers.xml
-nifi.login.identity.provider.configuration.file=./conf/login-identity-providers.xml
-nifi.templates.directory=./conf/templates
-nifi.ui.banner.text=
-nifi.ui.autorefresh.interval=30 sec
-nifi.nar.library.directory=./lib
-nifi.nar.working.directory=./work/nar/
-nifi.documentation.working.directory=./work/docs/components
-
-####################
-# State Management #
-####################
-nifi.state.management.configuration.file=./conf/state-management.xml
-# The ID of the local state provider
-nifi.state.management.provider.local=local-provider
-# The ID of the cluster-wide state provider. This will be ignored if NiFi is not clustered but must be populated if running in a cluster.
-nifi.state.management.provider.cluster=zk-provider
-# Specifies whether or not this instance of NiFi should run an embedded ZooKeeper server
-nifi.state.management.embedded.zookeeper.start=false
-# Properties file that provides the ZooKeeper properties to use if <nifi.state.management.embedded.zookeeper.start> is set to true
-nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
-
-
-# H2 Settings
-nifi.database.directory=./database_repository
-nifi.h2.url.append=;LOCK_TIMEOUT=25000;WRITE_DELAY=0;AUTO_SERVER=FALSE
-
-# FlowFile Repository
-nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
-nifi.flowfile.repository.directory=./flowfile_repository
-nifi.flowfile.repository.partitions=256
-nifi.flowfile.repository.checkpoint.interval=2 mins
-nifi.flowfile.repository.always.sync=false
-
-nifi.swap.manager.implementation=org.apache.nifi.controller.FileSystemSwapManager
-nifi.queue.swap.threshold=20000
-nifi.swap.in.period=5 sec
-nifi.swap.in.threads=1
-nifi.swap.out.period=5 sec
-nifi.swap.out.threads=4
-
-# Content Repository
-nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
-nifi.content.claim.max.appendable.size=10 MB
-nifi.content.claim.max.flow.files=100
-nifi.content.repository.directory.default=./content_repository
-nifi.content.repository.archive.max.retention.period=12 hours
-nifi.content.repository.archive.max.usage.percentage=50%
-nifi.content.repository.archive.enabled=true
-nifi.content.repository.always.sync=false
-nifi.content.viewer.url=/nifi-content-viewer/
-
-# Provenance Repository Properties
-nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
-
-# Persistent Provenance Repository Properties
-nifi.provenance.repository.directory.default=./provenance_repository
-nifi.provenance.repository.max.storage.time=24 hours
-nifi.provenance.repository.max.storage.size=1 GB
-nifi.provenance.repository.rollover.time=30 secs
-nifi.provenance.repository.rollover.size=100 MB
-nifi.provenance.repository.query.threads=2
-nifi.provenance.repository.index.threads=1
-nifi.provenance.repository.compress.on.rollover=true
-nifi.provenance.repository.always.sync=false
-nifi.provenance.repository.journal.count=16
-# Comma-separated list of fields. Fields that are not indexed will not be searchable. Valid fields are:
-# EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship, Details
-nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
-# FlowFile Attributes that should be indexed and made searchable. Some examples to consider are filename, uuid, mime.type
-nifi.provenance.repository.indexed.attributes=
-# Large values for the shard size will result in more Java heap usage when searching the Provenance Repository
-# but should provide better performance
-nifi.provenance.repository.index.shard.size=500 MB
-# Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from
-# the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved.
-nifi.provenance.repository.max.attribute.length=65536
-
-# Volatile Provenance Respository Properties
-nifi.provenance.repository.buffer.size=100000
-
-# Component Status Repository
-nifi.components.status.repository.implementation=org.apache.nifi.controller.status.history.VolatileComponentStatusRepository
-nifi.components.status.repository.buffer.size=1440
-nifi.components.status.snapshot.frequency=1 min
-
 # Site to Site properties
 nifi.remote.input.socket.host=localhost
 nifi.remote.input.socket.port=10000
 nifi.remote.input.secure=false
-# web properties #
-nifi.web.war.directory=./lib
-nifi.web.http.host=
-nifi.web.http.port=8080
-nifi.web.https.host=
-nifi.web.https.port=
-nifi.web.jetty.working.directory=./work/jetty
-nifi.web.jetty.threads=200
-
-# security properties #
-nifi.sensitive.props.key=
-nifi.sensitive.props.algorithm=PBEWITHMD5AND256BITAES-CBC-OPENSSL
-nifi.sensitive.props.provider=BC
-
-nifi.security.keystore=
-nifi.security.keystoreType=
-nifi.security.keystorePasswd=
-nifi.security.keyPasswd=
-nifi.security.truststore=
-nifi.security.truststoreType=
-nifi.security.truststorePasswd=
-nifi.security.needClientAuth=
-nifi.security.user.credential.cache.duration=24 hours
-nifi.security.user.authority.provider=file-provider
-nifi.security.user.login.identity.provider=
-nifi.security.support.new.account.requests=
-# Valid Authorities include: ROLE_MONITOR,ROLE_DFM,ROLE_ADMIN,ROLE_PROVENANCE,ROLE_NIFI
-nifi.security.anonymous.authorities=
-nifi.security.ocsp.responder.url=
-nifi.security.ocsp.responder.certificate=
-
-# cluster common properties (cluster manager and nodes must have same values) #
-nifi.cluster.protocol.heartbeat.interval=5 sec
-nifi.cluster.protocol.is.secure=false
-nifi.cluster.protocol.socket.timeout=30 sec
-nifi.cluster.protocol.connection.handshake.timeout=45 sec
-# if multicast is used, then nifi.cluster.protocol.multicast.xxx properties must be configured #
-nifi.cluster.protocol.use.multicast=false
-nifi.cluster.protocol.multicast.address=
-nifi.cluster.protocol.multicast.port=
-nifi.cluster.protocol.multicast.service.broadcast.delay=500 ms
-nifi.cluster.protocol.multicast.service.locator.attempts=3
-nifi.cluster.protocol.multicast.service.locator.attempts.delay=1 sec
-
-# cluster node properties (only configure for cluster nodes) #
-nifi.cluster.is.node=false
-nifi.cluster.node.address=
-nifi.cluster.node.protocol.port=
-nifi.cluster.node.protocol.threads=2
-# if multicast is not used, nifi.cluster.node.unicast.xxx must have same values as nifi.cluster.manager.xxx #
-nifi.cluster.node.unicast.manager.address=
-nifi.cluster.node.unicast.manager.protocol.port=
-
-# cluster manager properties (only configure for cluster manager) #
-nifi.cluster.is.manager=false
-nifi.cluster.manager.address=
-nifi.cluster.manager.protocol.port=
-nifi.cluster.manager.node.firewall.file=
-nifi.cluster.manager.node.event.history.size=10
-nifi.cluster.manager.node.api.connection.timeout=30 sec
-nifi.cluster.manager.node.api.read.timeout=30 sec
-nifi.cluster.manager.node.api.request.threads=10
-nifi.cluster.manager.flow.retrieval.delay=5 sec
-nifi.cluster.manager.protocol.threads=10
-nifi.cluster.manager.safemode.duration=0 sec
-
-# kerberos #
-nifi.kerberos.krb5.file=
-
 # MiNiFi Server for Command Control
 nifi.server.name=localhost
 nifi.server.port=9000
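
Reviewer note (not part of the commit): the sketch below, in Python rather than the project's C++, illustrates how the keys left in conf/minifi.properties might be read, and how `nifi.flow.configuration.file` selects the flow definition as the new README text describes. The function name `load_properties`, the parsing rules (skip blank lines and `#` comments, split on the first `=`) and the `./conf/flow.yml` fallback are assumptions made for illustration; the agent's actual parser may behave differently. `SAMPLE` contains only the keys visible in this diff, so the real file may hold more.

```python
# Illustrative only: a hypothetical reader for the trimmed conf/minifi.properties.
# The parsing rules are assumptions, not the agent's actual C++ logic.

# Only the keys visible in the diff above; the real file may contain others.
SAMPLE = """\
# Core Properties #
nifi.version=0.0.1
nifi.flow.configuration.file=./conf/flow.yml
nifi.flowcontroller.graceful.shutdown.period=10 sec
nifi.flowservice.writedelay.interval=500 ms
nifi.administrative.yield.duration=30 sec
nifi.bored.yield.duration=10 millis

# Site to Site properties
nifi.remote.input.socket.host=localhost
nifi.remote.input.socket.port=10000
nifi.remote.input.secure=false

# MiNiFi Server for Command Control
nifi.server.name=localhost
nifi.server.port=9000
"""


def load_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # lines without '=' are ignored in this sketch
            props[key.strip()] = value.strip()
    return props


props = load_properties(SAMPLE)

# Assumed fallback: the README says a flow.yml in the conf directory is the default.
flow_file = props.get("nifi.flow.configuration.file", "./conf/flow.yml")

# Site to Site still needs the remote socket specified manually (see MINIFI-70).
site_to_site = (props["nifi.remote.input.socket.host"],
                int(props["nifi.remote.input.socket.port"]))

print(flow_file)
print(site_to_site)
```

Pointing `nifi.flow.configuration.file` at a different path is the only change needed to run the agent against another flow definition.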
