GitHub user markap14 commented on a diff in the pull request:
https://github.com/apache/nifi/pull/1877#discussion_r135305805
--- Diff:
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-service-api/src/main/java/org/apache/nifi/serialization/RecordSetWriterFactory.java
---
@@ -45,20 +51,23 @@
*/
public interface RecordSetWriterFactory extends ControllerService {
+ InputStream EMPTY_INPUT_STREAM = new ByteArrayInputStream(new byte[0]);
+
/**
* <p>
- * Returns the Schema that will be used for writing Records. Note that the FlowFile and InputStream that are given
- * may well be different than the FlowFile that the writer will write to. The given FlowFile and InputStream are
+ * Returns the Schema that will be used for writing Records. Note that the InputStream that are given
+ * may well be different than the content that the writer will write. The given variables and InputStream are
* intended to be used for determining the schema that should be used when writing records.
* </p>
*
- * @param flowFile the FlowFile from which the schema should be determined.
+ * @param variables the variables which is used to resolve Record Schema via Expression Language, can be null or empty
+ * @param content the contents of the input data from which to determine the schema
* @param readSchema the schema that was read from the incoming FlowFile, or <code>null</code> if there is no input schema
*
* @return the Schema that should be used for writing Records
* @throws SchemaNotFoundException if unable to find the schema
*/
- RecordSchema getSchema(FlowFile flowFile, RecordSchema readSchema) throws SchemaNotFoundException, IOException;
+ RecordSchema getSchema(Map<String, String> variables, InputStream content, RecordSchema readSchema) throws SchemaNotFoundException, IOException;
--- End diff --
@ijokarumawak can you explain the reasoning for providing a
Map<String, String> and an InputStream here? Previously, with just the FlowFile,
the Writer had access only to the attributes, not the content (because it had no
ProcessSession). I believe that is the correct abstraction. The RecordSchema
from the reader is already passed in, and the FlowFile being written to will
generally have no content, so reading from the destination FlowFile wouldn't
make sense. It's possible that I'm just misunderstanding the idea here, though.
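
To make the attribute-only abstraction concrete, here is a minimal sketch of a
writer-side schema lookup that consults only a map of attributes and the schema
already produced by the reader, and never touches content. It does not use the
NiFi API: `SimpleSchema` and `AttributeOnlySchemaResolver` are hypothetical
stand-ins, and the `schema.name` key is an assumption made for illustration.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for a record schema, reduced to a name.
final class SimpleSchema {
    final String name;

    SimpleSchema(String name) {
        this.name = name;
    }
}

// Sketch of a resolver that sees attributes and the reader's schema only.
final class AttributeOnlySchemaResolver {
    // Assumed attribute key; not taken from the diff above.
    static final String SCHEMA_NAME_ATTR = "schema.name";

    static SimpleSchema getSchema(Map<String, String> variables, SimpleSchema readSchema) {
        // Treat a null map the same as an empty one.
        Map<String, String> vars =
                (variables == null) ? Collections.<String, String>emptyMap() : variables;

        // Prefer a schema explicitly named by an attribute.
        String requested = vars.get(SCHEMA_NAME_ATTR);
        if (requested != null && !requested.isEmpty()) {
            return new SimpleSchema(requested);
        }

        // Otherwise fall back to the schema the reader already determined.
        if (readSchema != null) {
            return readSchema;
        }
        throw new IllegalStateException("No schema named in attributes and no read schema given");
    }
}
```

In this sketch no InputStream is needed: every input to the decision is
available without reading the content of the destination FlowFile.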