[
https://issues.apache.org/jira/browse/NIFI-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16481380#comment-16481380
]
ASF GitHub Bot commented on NIFI-4165:
--------------------------------------
Github user alopresto commented on the issue:
https://github.com/apache/nifi/pull/2502
@markap14 sorry I got distracted from this review. I have revisited it and
have some points I'd like to discuss:
* I rebased against `master`, as there have obviously been some changes
there. These fall into a couple of places:
** the version bump to `1.7.0-SNAPSHOT` in the `pom.xml` for both this
artifact and a dependency
** there have been changes to `FlowFileQueue` which `DummyFlowFileQueue`
must implement
* I added some logic to `RemoveFlowFilesWithMissingContent` which loads the
*master key* from the expected `bootstrap.conf` file in order to handle a
`nifi.properties` file with encrypted configuration values.
* The other NiFi Toolkit components each have a `*.bat`/`*.sh` script which
allows them to be run. This provides a couple of features:
** named command-line arguments as opposed to positional arguments
** setting up `$JAVA_HOME` and the classpath rather than calling `java`
directly on the command-line
* The `jar-with-dependencies` in `maven-assembly-plugin` only seems to run
when you use `mvn clean compile assembly:single` rather than being tied to the
`install` phase via a profile (see [Stack
Overflow](https://stackoverflow.com/a/574650/70465)). Please let me know if I'm
missing something here.
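To illustrate the `bootstrap.conf` point in the list above, here is a minimal sketch of pulling the key out of that file. It is not the code on the branch. The class `BootstrapKeyReader` and its methods are hypothetical, and the property name `nifi.bootstrap.sensitive.key` is an assumption about how the key line is written in `bootstrap.conf`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, for illustration only.
class BootstrapKeyReader {
    // Assumed name of the bootstrap.conf property that holds the key.
    static final String KEY_PREFIX = "nifi.bootstrap.sensitive.key=";

    // Returns the first non-empty key value found in the given lines.
    // Commented-out lines (starting with '#') do not match the prefix.
    static Optional<String> extractKey(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> line.startsWith(KEY_PREFIX))
                .map(line -> line.substring(KEY_PREFIX.length()).trim())
                .filter(key -> !key.isEmpty())
                .findFirst();
    }

    // Reads the file and delegates to extractKey.
    static Optional<String> readKey(Path bootstrapConf) throws IOException {
        return extractKey(Files.readAllLines(bootstrapConf, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "# Master key in hexadecimal format",
                "nifi.bootstrap.sensitive.key=0123456789ABCDEF");
        System.out.println(extractKey(sample).isPresent());
    }
}
```

An empty `Optional` would mean no key is configured, in which case the tool can treat `nifi.properties` as unencrypted.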
I ran the scenario you suggested by generating some flowfiles into a queue
and then removing the `content_repository` directory contents. When I did that,
I got this message:
```
hw12203:/Users/alopresto/Workspace/nifi/nifi-toolkit/nifi-toolkit-flowfile-repo
(pr2502) alopresto
🔓 149s @ 17:17:25 $ cd target/
hw12203:...ers/alopresto/Workspace/nifi/nifi-toolkit/nifi-toolkit-flowfile-repo/target
(pr2502) alopresto
🔓 0s @ 17:17:31 $ java -cp
nifi-toolkit-flowfile-repo-1.7.0-SNAPSHOT-jar-with-dependencies.jar:../../nifi-toolkit-assembly/target/nifi-toolkit-1.7.0-SNAPSHOT-bin/nifi-toolkit-1.7.0-SNAPSHOT/lib/slf4j-api-1.7.25.jar
org.apache.nifi.toolkit.repos.flowfile.RemoveFlowFilesWithMissingContent
~/Workspace/nifi/nifi-assembly/target/nifi-1.7.0-SNAPSHOT-bin/nifi-1.7.0-SNAPSHOT/conf/nifi.properties
~/Workspace/nifi/nifi-assembly/target/nifi-1.7.0-SNAPSHOT-bin/nifi-1.7.0-SNAPSHOT/flowfile_repository/
17:17:35.865 [main] INFO org.apache.nifi.properties.NiFiPropertiesLoader -
Loaded 148 properties from
/Users/alopresto/Workspace/nifi/nifi-assembly/target/nifi-1.7.0-SNAPSHOT-bin/nifi-1.7.0-SNAPSHOT/conf/nifi.properties
17:17:35.872 [main] DEBUG
org.apache.nifi.properties.ProtectedNiFiProperties - Loaded 148 properties
(including 0 protection schemes) into ProtectedNiFiProperties
17:17:35.872 [main] DEBUG
org.apache.nifi.properties.ProtectedNiFiProperties - No protected properties
Cannot find or cannot read ./content_repository or it is not a directory
hw12203:...ers/alopresto/Workspace/nifi/nifi-toolkit/nifi-toolkit-flowfile-repo/target
(pr2502) alopresto
🔓 0s @ 17:17:36 $
```
The directory definitely exists:
```
hw12203:...space/nifi/nifi-assembly/target/nifi-1.7.0-SNAPSHOT-bin/nifi-1.7.0-SNAPSHOT
(pr2502) alopresto
🔓 0s @ 17:17:48 $ ll
total 416
drwxr-xr-x 17 alopresto staff 578B May 10 16:40 ./
drwxr-xr-x 3 alopresto staff 102B May 10 10:20 ../
-rw-r--r-- 1 alopresto staff 119K Mar 13 17:25 LICENSE
-rw-r--r-- 1 alopresto staff 80K May 10 09:23 NOTICE
-rw-r--r-- 1 alopresto staff 4.4K Dec 13 15:56 README
drwxr-xr-x 8 alopresto staff 272B May 10 10:20 bin/
drwxr-xr-x 12 alopresto staff 408B May 18 16:51 conf/
drwxr-xr-x 2 alopresto staff 68B May 18 16:51 content_repository/
drwxr-xr-x 6 alopresto staff 204B May 18 16:50 database_repository/
drwxr-xr-x 3 alopresto staff 102B May 10 10:20 docs/
drwxr-xr-x 5 alopresto staff 170B May 18 16:52 flowfile_repository/
drwxr-xr-x 113 alopresto staff 3.8K May 10 10:20 lib/
drwxr-xr-x 10 alopresto staff 340B May 18 17:00 logs/
drwxr-xr-x 9 alopresto staff 306B May 18 16:51 provenance_repository/
drwxr-xr-x 4 alopresto staff 136B May 18 16:49 run/
drwxr-xr-x 3 alopresto staff 102B May 10 16:40 state/
drwxr-xr-x 5 alopresto staff 170B May 18 16:50 work/
hw12203:...space/nifi/nifi-assembly/target/nifi-1.7.0-SNAPSHOT-bin/nifi-1.7.0-SNAPSHOT
(pr2502) alopresto
🔓 0s @ 17:19:35 $ ll content_repository/
total 0
drwxr-xr-x 2 alopresto staff 68B May 18 16:51 ./
drwxr-xr-x 17 alopresto staff 578B May 10 16:40 ../
```
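I believe this is because in the default `nifi.properties` file, the
content repository is defined as a relative path, `./content_repository`.
Because I ran the tool from the toolkit's `target/` directory, that path was
looked up relative to the current working directory rather than the NiFi
installation directory. I think there should be code in the tool to resolve
this path if it is not absolute.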
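A rough sketch of what that resolution could look like follows. The class `RepositoryPathResolver` and its `resolve` method are hypothetical, and the choice of base directory is an assumption: it treats the parent of the `conf/` directory containing `nifi.properties` as the NiFi home, which matches the default layout shown above but may not hold for every installation.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class RepositoryPathResolver {

    // Absolute paths are returned as-is (normalized). Relative paths are
    // resolved against the assumed NiFi home: <home>/conf/nifi.properties.
    static Path resolve(Path nifiPropertiesFile, String configuredPath) {
        Path configured = Paths.get(configuredPath);
        if (configured.isAbsolute()) {
            return configured.normalize();
        }
        Path confDir = nifiPropertiesFile.toAbsolutePath().getParent();
        Path nifiHome = (confDir == null) ? null : confDir.getParent();
        if (nifiHome == null) {
            throw new IllegalArgumentException(
                    "Cannot derive a base directory from " + nifiPropertiesFile);
        }
        return nifiHome.resolve(configured).normalize();
    }

    public static void main(String[] args) {
        Path props = Paths.get("/opt/nifi/conf/nifi.properties");
        System.out.println(resolve(props, "./content_repository"));
        System.out.println(resolve(props, "/data/content_repository"));
    }
}
```

With this approach the working directory no longer matters, so running the tool from `target/` would find the same directory as running it from the NiFi home.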
Let me know what you think about those comments. I pushed my changes [to a
branch
NIFI-4165](https://github.com/alopresto/nifi/commit/579da9117d03ad9cc24499bbad6d27fae7c92037).
> Update NiFi FlowFile Repository Toolkit to provide ability to remove
> FlowFiles whose content is missing
> -------------------------------------------------------------------------------------------------------
>
> Key: NIFI-4165
> URL: https://issues.apache.org/jira/browse/NIFI-4165
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Tools and Build
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
>
> The FlowFile Repo toolkit has the ability to address issues with flowfile
> repo corruption due to sudden power loss. Another problem that has been known
> to occur is if content goes missing from the content repository for whatever
> reason (say some process deletes some of the files) then the FlowFile Repo
> can contain a lot of FlowFiles whose content is missing. This causes a lot of
> problems with stack traces being dumped to logs and the flow taking a really
> long time to get back to normal. We should update the toolkit to provide a
> mechanism for pointing to a FlowFile Repo and Content Repo, then writing out
> a new FlowFile Repo that removes any FlowFile whose content is missing.