[
https://issues.apache.org/jira/browse/NIFI-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16303032#comment-16303032
]
ASF GitHub Bot commented on NIFI-4707:
--------------------------------------
Github user ijokarumawak commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2351#discussion_r158616024
--- Diff: nifi-nar-bundles/nifi-extension-utils/nifi-reporting-utils/src/main/java/org/apache/nifi/reporting/util/provenance/ProvenanceEventConsumer.java ---
@@ -218,18 +230,35 @@ private boolean isFilteringEnabled() {
         return componentTypeRegex != null || !eventTypes.isEmpty() || !componentIds.isEmpty();
     }
-    private List<ProvenanceEventRecord> filterEvents(List<ProvenanceEventRecord> provenanceEvents) {
-        if(isFilteringEnabled()) {
-            List<ProvenanceEventRecord> filteredEvents = new ArrayList<ProvenanceEventRecord>();
+    private List<ProvenanceEventRecord> filterEvents(ComponentMapHolder componentMapHolder, List<ProvenanceEventRecord> provenanceEvents) {
+        if (isFilteringEnabled()) {
+            List<ProvenanceEventRecord> filteredEvents = new ArrayList<>();
             for (ProvenanceEventRecord provenanceEventRecord : provenanceEvents) {
-                if(!componentIds.isEmpty() && !componentIds.contains(provenanceEventRecord.getComponentId())) {
-                    continue;
+                final String componentId = provenanceEventRecord.getComponentId();
+                if (!componentIds.isEmpty() && !componentIds.contains(componentId)) {
+                    // If we aren't filtering it out based on component ID, let's see if this component has a parent process group IDs
+                    // that is being filtered on
+                    if (componentMapHolder == null) {
+                        continue;
+                    }
+                    final String processGroupId = componentMapHolder.getProcessGroupId(componentId, provenanceEventRecord.getComponentType());
+                    if (StringUtils.isEmpty(processGroupId)) {
+                        continue;
+                    }
+                    // Check if any parent process group has the specified component ID
+                    ParentProcessGroupSearchNode matchedComponent = componentMapHolder.getProcessGroupParent(componentId);
+                    while (matchedComponent != null && !matchedComponent.getId().equals(processGroupId) && !componentIds.contains(matchedComponent.getId())) {
--- End diff --
The condition `!matchedComponent.getId().equals(processGroupId)` should be
removed.
It does not work when a ProcessGroup ID is used for filtering. For example,
suppose there are Root, PG1, and PG2, Component C1 is in PG1, and the reporting
task is configured to filter on PG2. In that case, `processGroupId` would be
PG1, which is not specified in `componentIds`. Since `componentIds` only
contains PG2, C1 in PG1 should be filtered out, but the extra condition stops
the loop at PG1 with a non-null match, so C1 passes.
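To illustrate the suggestion, here is a minimal, self-contained sketch of the
parent walk with that condition removed. It is not the actual NiFi code: `Node`
is a hypothetical stand-in for `ParentProcessGroupSearchNode`, and I am
assuming that each node exposes its parent group via `getParent()` and that the
search starts at the process group containing the component.

```java
import java.util.Collections;
import java.util.Set;

public class ParentGroupFilterSketch {

    // Hypothetical stand-in for ParentProcessGroupSearchNode:
    // a process group ID plus a link to its parent group (null for the root).
    static final class Node {
        private final String id;
        private final Node parent;

        Node(String id, Node parent) {
            this.id = id;
            this.parent = parent;
        }

        String getId() {
            return id;
        }

        Node getParent() {
            return parent;
        }
    }

    // Walks from the component's containing group up to the root and returns
    // true only if one of those groups is listed in componentIds.
    // Note there is no comparison against the component's own processGroupId.
    static boolean hasFilteredAncestor(Node containingGroup, Set<String> componentIds) {
        Node current = containingGroup;
        while (current != null && !componentIds.contains(current.getId())) {
            current = current.getParent();
        }
        return current != null;
    }

    public static void main(String[] args) {
        Node root = new Node("Root", null);
        Node pg1 = new Node("PG1", root);
        Node pg2 = new Node("PG2", root);
        Set<String> componentIds = Collections.singleton("PG2");

        // C1 lives in PG1, which is not under PG2, so it should be filtered out.
        System.out.println("C1 in PG1 passes: " + hasFilteredAncestor(pg1, componentIds));
        // A component living in PG2 should pass.
        System.out.println("C2 in PG2 passes: " + hasFilteredAncestor(pg2, componentIds));
    }
}
```

With the scenario above (filter on PG2, C1 in PG1), the walk visits PG1 and
Root, finds neither in `componentIds`, and ends with a null match, so the event
is skipped. A component inside PG2, or inside a group nested under PG2, still
matches.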
> SiteToSiteProvenanceReportingTask not returning correct metadata
> ----------------------------------------------------------------
>
> Key: NIFI-4707
> URL: https://issues.apache.org/jira/browse/NIFI-4707
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
>
> When the SiteToSiteProvenanceReportingTask emits flow files, some of them
> include a "componentName" field and some do not. Investigation shows that
> only the components (except connections) in the root process group have that
> field populated. Having this information can be very helpful to the user:
> even though the names might be duplicated, there would be a mapping between a
> component's ID and its name. At the very least the behavior (i.e. component
> name being available) should be consistent.
> Having a full map (by traversing the entire flow) also opens up the ability
> to include Process Group information for the various components. The
> reporting task could include the parent Process Group identifier and/or name,
> with perhaps a special ID for the root PG's "parent", such as "@ROOT@" or
> something unique.
> This could also allow for a PG ID in the list of filtered "component IDs",
> where any provenance event for a processor in a particular PG could be
> included in a filter when that PG's ID is in the filter list.