Colin Dean created NIFI-5591:
--------------------------------
Summary: Enable compression of Avro in ExecuteSQL
Key: NIFI-5591
URL: https://issues.apache.org/jira/browse/NIFI-5591
Project: Apache NiFi
Issue Type: Improvement
Components: Core Framework
Affects Versions: 1.7.1
Environment: macOS, Java 8
Reporter: Colin Dean
The Avro stream that comes out of the ExecuteSQL processor is uncompressed.
It's possible to rewrite it compressed using a ConvertRecord
processor with an AvroReader and an AvroRecordSetWriter, but that's a lot of extra
I/O that could be handled transparently at the moment that the Avro data is
created.
For implementation, it looks like ExecuteSQL builds a set of
{{JdbcCommon.AvroConversionOptions}} [here|https://github.com/apache/nifi/blob/ea9b0db2f620526c8dd0db595cf8b44c3ef835be/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExecuteSQL.java#L246].
That options object would need to gain a compression flag. Then, within
{{JdbcCommon#convertToAvroStream}}
[here|https://github.com/apache/nifi/blob/0dd4a91a6741eec04965a260c8aff38b72b3828d/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/JdbcCommon.java#L281],
the {{dataFileWriter}} would have its codec set with {{setCodec}}, using a codec
created shortly before from that option.
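
To make the "compression flag" idea concrete, here is a minimal sketch of a builder-style options object gaining one. It is not NiFi's actual class: {{ConversionOptionsSketch}}, the {{Compression}} enum, and the {{recordName}} field are hypothetical stand-ins, and it only uses the JDK. The point is that defaulting the new field to "no compression" would keep existing flows unchanged.

```java
import java.util.Objects;

// Hypothetical stand-in for the options object; not NiFi's actual class.
final class ConversionOptionsSketch {

    /** Compression choices; these names are illustrative assumptions. */
    enum Compression { NONE, DEFLATE, SNAPPY, BZIP2 }

    private final String recordName;
    private final Compression compression;

    private ConversionOptionsSketch(String recordName, Compression compression) {
        this.recordName = recordName;
        this.compression = compression;
    }

    static Builder builder() {
        return new Builder();
    }

    String recordName() {
        return recordName;
    }

    Compression compression() {
        return compression;
    }

    static final class Builder {
        private String recordName = "ExampleRecord";
        // Defaulting to NONE preserves the current uncompressed output
        // unless a flow explicitly opts in.
        private Compression compression = Compression.NONE;

        Builder recordName(String recordName) {
            this.recordName = Objects.requireNonNull(recordName, "recordName");
            return this;
        }

        Builder compression(Compression compression) {
            this.compression = Objects.requireNonNull(compression, "compression");
            return this;
        }

        ConversionOptionsSketch build() {
            return new ConversionOptionsSketch(recordName, compression);
        }
    }

    public static void main(String[] args) {
        ConversionOptionsSketch options = ConversionOptionsSketch.builder()
                .compression(Compression.SNAPPY)
                .build();
        // The code that writes the Avro stream would read this value and
        // configure the writer's codec before the first record is appended.
        System.out.println(options.compression());
    }
}
```

The writer-side code would then consult {{compression()}} once, before any records are written.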
For an example of creating the codec, I looked at how the AvroRecordSetWriter does
it. The {{setCodec()}} call is performed
[here|https://github.com/apache/nifi/blob/ea9b0db2f620526c8dd0db595cf8b44c3ef835be/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/WriteAvroResultWithSchema.java#L44]
after the codec is created from the configured compression option
[here|https://github.com/apache/nifi/blob/ea9b0db2f620526c8dd0db595cf8b44c3ef835be/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroRecordSetWriter.java#L104]
using a factory method
[here|https://github.com/apache/nifi/blob/ea9b0db2f620526c8dd0db595cf8b44c3ef835be/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroRecordSetWriter.java#L137].
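
A factory method of that kind could be sketched roughly as follows. This is a JDK-only illustration, not the linked NiFi code: {{CodecNameSketch}}, {{avroCodecName}}, and the accepted property values are assumptions. It resolves a property value to the codec name that Avro records in the container file header ("null" means uncompressed); a real implementation would more likely return an Avro {{CodecFactory}} and pass it to {{setCodec}}.

```java
import java.util.Locale;

// Hypothetical helper; the class name, method name, and accepted values are assumptions.
final class CodecNameSketch {

    private CodecNameSketch() {
    }

    /**
     * Maps a user-facing compression option to an Avro codec name.
     * A missing value falls back to "null", which is Avro's name for no compression.
     */
    static String avroCodecName(String propertyValue) {
        if (propertyValue == null) {
            return "null";
        }
        switch (propertyValue.trim().toUpperCase(Locale.ROOT)) {
            case "NONE":
                return "null";
            case "DEFLATE":
                return "deflate";
            case "SNAPPY":
                return "snappy";
            case "BZIP2":
                return "bzip2";
            default:
                // Failing fast avoids silently writing uncompressed data.
                throw new IllegalArgumentException("Unsupported compression: " + propertyValue);
        }
    }

    public static void main(String[] args) {
        System.out.println(avroCodecName("snappy"));
    }
}
```

Rejecting unknown values, rather than falling back to no compression, is a design choice here; a processor could equally validate the property up front.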
