Github user ijokarumawak commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2351#discussion_r158616024
--- Diff:
nifi-nar-bundles/nifi-extension-utils/nifi-reporting-utils/src/main/java/org/apache/nifi/reporting/util/provenance/ProvenanceEventConsumer.java
---
@@ -218,18 +230,35 @@ private boolean isFilteringEnabled() {
return componentTypeRegex != null || !eventTypes.isEmpty() || !componentIds.isEmpty();
}
- private List<ProvenanceEventRecord> filterEvents(List<ProvenanceEventRecord> provenanceEvents) {
- if(isFilteringEnabled()) {
- List<ProvenanceEventRecord> filteredEvents = new ArrayList<ProvenanceEventRecord>();
+ private List<ProvenanceEventRecord> filterEvents(ComponentMapHolder componentMapHolder, List<ProvenanceEventRecord> provenanceEvents) {
+ if (isFilteringEnabled()) {
+ List<ProvenanceEventRecord> filteredEvents = new ArrayList<>();
for (ProvenanceEventRecord provenanceEventRecord : provenanceEvents) {
- if(!componentIds.isEmpty() && !componentIds.contains(provenanceEventRecord.getComponentId())) {
- continue;
+ final String componentId = provenanceEventRecord.getComponentId();
+ if (!componentIds.isEmpty() && !componentIds.contains(componentId)) {
+ // If we aren't filtering it out based on component ID, let's see if this component has a parent process group IDs
+ // that is being filtered on
+ if (componentMapHolder == null) {
+ continue;
+ }
+ final String processGroupId = componentMapHolder.getProcessGroupId(componentId, provenanceEventRecord.getComponentType());
+ if (StringUtils.isEmpty(processGroupId)) {
+ continue;
+ }
+ // Check if any parent process group has the specified component ID
+ ParentProcessGroupSearchNode matchedComponent = componentMapHolder.getProcessGroupParent(componentId);
+ while (matchedComponent != null && !matchedComponent.getId().equals(processGroupId) && !componentIds.contains(matchedComponent.getId())) {
--- End diff --
The condition `!matchedComponent.getId().equals(processGroupId)` should be
removed.
It does not work when a ProcessGroup ID is used for filtering. For example,
suppose there are Root, PG1, and PG2, Component C1 is in PG1, and the reporting
task is configured to filter on PG2. In that case `processGroupId` is PG1,
which is not specified in `componentIds`. Since `componentIds` contains only
PG2, C1 in PG1 should be filtered out, but the condition lets C1 pass.
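
The scenario can be reproduced with a small standalone sketch. This is not the
NiFi code: `Node` is a hypothetical stand-in for `ParentProcessGroupSearchNode`
with an assumed `parent` link, PG1 and PG2 are assumed to be direct children of
Root, the walk is assumed to start at the component's own process group, and
the code after the loop is assumed to treat a non-null `matchedComponent` as a
match.

```java
import java.util.Set;

class ParentGroupFilterSketch {

    // Hypothetical stand-in for ParentProcessGroupSearchNode:
    // a process group ID plus a link to its parent group (null for Root).
    static final class Node {
        final String id;
        final Node parent;

        Node(String id, Node parent) {
            this.id = id;
            this.parent = parent;
        }
    }

    // Walks up the process group hierarchy from `start`.
    // With keepExtraCondition == true the loop mirrors the condition in the diff;
    // with false it drops the `equals(processGroupId)` check as suggested.
    // Returns true when the walk stops on a non-null node, which this sketch
    // assumes means "the event passes the filter".
    static boolean passesFilter(Node start, String processGroupId,
                                Set<String> componentIds, boolean keepExtraCondition) {
        Node matched = start;
        while (matched != null
                && !(keepExtraCondition && matched.id.equals(processGroupId))
                && !componentIds.contains(matched.id)) {
            matched = matched.parent;
        }
        return matched != null;
    }
}
```

With `componentIds` containing only PG2 and the walk starting at PG1, the loop
with the extra condition stops immediately on PG1 and C1 passes. Without it,
the walk goes PG1, Root, then null, and C1 is filtered out. A component in PG2
passes in both variants.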
---