Github user olegz commented on a diff in the pull request:
https://github.com/apache/nifi/pull/1115#discussion_r84719151
--- Diff:
nifi-commons/nifi-processor-utilities/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java
---
@@ -273,25 +262,26 @@ private int binFlowFiles(final ProcessContext context, final ProcessSessionFacto
}
final ProcessSession session = sessionFactory.createSession();
- FlowFile flowFile = session.get();
- if (flowFile == null) {
+ final List<FlowFile> flowFiles = session.get(1000);
+ if (flowFiles.isEmpty()) {
break;
}
- flowFile = this.preprocessFlowFile(context, session, flowFile);
-
- String groupId = this.getGroupId(context, flowFile);
-
- final boolean binned = binManager.offer(groupId, flowFile, session);
-
- // could not be added to a bin -- probably too large by itself, so create a separate bin for just this guy.
- if (!binned) {
- Bin bin = new Bin(0, Long.MAX_VALUE, 0, Integer.MAX_VALUE, null);
- bin.offer(flowFile, session);
- this.readyBins.add(bin);
+ final Map<String, List<FlowFile>> flowFileGroups = new HashMap<>();
+ for (FlowFile flowFile : flowFiles) {
+ flowFile = this.preprocessFlowFile(context, session, flowFile);
+ final String groupingIdentifier = getGroupId(context, flowFile);
+ flowFileGroups.computeIfAbsent(groupingIdentifier, id -> new ArrayList<>()).add(flowFile);
}
- flowFilesBinned++;
+ for (final Map.Entry<String, List<FlowFile>> entry : flowFileGroups.entrySet()) {
+ final Set<FlowFile> unbinned = binManager.offer(entry.getKey(), entry.getValue(), session, sessionFactory);
+ for (final FlowFile flowFile : unbinned) {
+ Bin bin = new Bin(session, 0, Long.MAX_VALUE, 0, Integer.MAX_VALUE, null);
+ bin.offer(flowFile, session);
+ this.readyBins.add(bin);
+ }
+ }
--- End diff --
After looking at ```BinManager.offer(..)```, I am not sure I understand
what's happening in the inner loop above. Aren't you essentially doing the same
thing at ```BinManager:201```?
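
To make the question concrete, here is a minimal standalone sketch of the pattern as I read it from the diff: group by id, offer each group, then have the caller wrap every rejected item in its own bin. Everything here is a hypothetical stand-in (`GroupingSketch`, `offer`, `binAll`, and the length-based rejection rule are assumptions for illustration, not NiFi API), so it may not match what ```BinManager.offer(..)``` really does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-ins only; none of these names are NiFi API.
class GroupingSketch {

    // Group items by key, the same computeIfAbsent idiom the diff uses.
    static Map<String, List<String>> groupBy(List<String> items, Function<String, String> keyOf) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String item : items) {
            groups.computeIfAbsent(keyOf.apply(item), k -> new ArrayList<>()).add(item);
        }
        return groups;
    }

    // Assumed offer semantics: items no longer than maxLength go into the bin,
    // everything else is handed back to the caller as "unbinned".
    static List<String> offer(List<String> group, int maxLength, List<String> bin) {
        List<String> unbinned = new ArrayList<>();
        for (String item : group) {
            if (item.length() <= maxLength) {
                bin.add(item);
            } else {
                unbinned.add(item);
            }
        }
        return unbinned;
    }

    // Caller-side loop: one shared bin per group, plus a dedicated bin per rejected item.
    static List<List<String>> binAll(List<String> items, int maxLength, Function<String, String> keyOf) {
        List<List<String>> readyBins = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : groupBy(items, keyOf).entrySet()) {
            List<String> bin = new ArrayList<>();
            List<String> unbinned = offer(entry.getValue(), maxLength, bin);
            if (!bin.isEmpty()) {
                readyBins.add(bin);
            }
            for (String leftover : unbinned) {
                // This is the step in question: a separate bin for each rejected item.
                readyBins.add(Collections.singletonList(leftover));
            }
        }
        return readyBins;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a1", "a2", "b-very-long", "b1");
        System.out.println(binAll(items, 2, s -> s.substring(0, 1)));
    }
}
```

If the offer step already creates the dedicated bin for oversized items, the `for (String leftover : unbinned)` loop in the caller would be redundant, which is what I am asking about.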