[
https://issues.apache.org/jira/browse/NIFI-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991012#comment-15991012
]
ASF GitHub Bot commented on NIFI-3724:
--------------------------------------
Github user alopresto commented on a diff in the pull request:
https://github.com/apache/nifi/pull/1712#discussion_r114147981
--- Diff:
nifi-nar-bundles/nifi-extension-utils/nifi-hadoop-utils/src/main/java/org/apache/nifi/processors/hadoop/AbstractHadoopProcessor.java
---
@@ -67,40 +61,14 @@
*/
@RequiresInstanceClassLoading(cloneAncestorResources = true)
public abstract class AbstractHadoopProcessor extends AbstractProcessor {
- /**
- * Compression Type Enum
- */
- public enum CompressionType {
- NONE,
- DEFAULT,
- BZIP,
- GZIP,
- LZ4,
- SNAPPY,
- AUTOMATIC;
-
- @Override
- public String toString() {
- switch (this) {
- case NONE: return "NONE";
- case DEFAULT: return DefaultCodec.class.getName();
- case BZIP: return BZip2Codec.class.getName();
- case GZIP: return GzipCodec.class.getName();
- case LZ4: return Lz4Codec.class.getName();
- case SNAPPY: return SnappyCodec.class.getName();
- case AUTOMATIC: return "Automatically Detected";
- }
- return null;
- }
- }
// properties
public static final PropertyDescriptor HADOOP_CONFIGURATION_RESOURCES = new PropertyDescriptor.Builder()
.name("Hadoop Configuration Resources")
.description("A file or comma separated list of files which contains the Hadoop file system configuration. Without this, Hadoop "
+ "will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or will revert to a default configuration.")
.required(false)
- .addValidator(createMultipleFilesExistValidator())
+ .addValidator(HadoopValidators.MULTIPLE_FILE_EXISTS_VALIDATOR)
--- End diff --
Minor comment -- until I read the source code for this, my interpretation
was that this validator ensured that *multiple files existed* -- i.e. that
providing a single file would fail validation. Perhaps we can rename this
`ONE_OR_MORE_FILES_EXIST_VALIDATOR`? Not a giant issue, but potentially
confusing.
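To illustrate the naming concern: the validator accepts any comma-separated list of paths as long as each listed file exists, so a single file passes. This is a simplified, self-contained sketch of that semantics -- the class and method names here are hypothetical and this is not NiFi's actual `Validator` API, which wraps results in `ValidationResult` objects:

```java
import java.io.File;

// Hypothetical standalone sketch of the "one or more files exist" check.
// Not the real NiFi HadoopValidators implementation.
public class OneOrMoreFilesExistValidator {

    // Valid when the input names at least one path and every
    // comma-separated entry exists as a regular file.
    // Note: a single existing file is sufficient -- despite the
    // "multiple files" name, the list may have length one.
    public static boolean validate(String commaSeparatedPaths) {
        if (commaSeparatedPaths == null || commaSeparatedPaths.trim().isEmpty()) {
            return false;
        }
        for (String entry : commaSeparatedPaths.split(",")) {
            File file = new File(entry.trim());
            if (!file.exists() || !file.isFile()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("core-site", ".xml");
        tmp.deleteOnExit();
        // A single existing file validates successfully.
        System.out.println(validate(tmp.getAbsolutePath()));
        // A missing file fails validation.
        System.out.println(validate("/no/such/file.xml"));
    }
}
```

Under this reading, `ONE_OR_MORE_FILES_EXIST_VALIDATOR` (or `FILES_EXIST_VALIDATOR`) would describe the behavior more precisely than `MULTIPLE_FILE_EXISTS_VALIDATOR`.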
> Add Put/Fetch Parquet Processors
> --------------------------------
>
> Key: NIFI-3724
> URL: https://issues.apache.org/jira/browse/NIFI-3724
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Bryan Bende
> Assignee: Bryan Bende
> Priority: Minor
> Fix For: 1.2.0
>
>
> Now that we have the record reader/writer services currently in master, it
> would be nice to have readers and writers for Parquet. Since Parquet's API is
> based on the Hadoop Path object, and not InputStreams/OutputStreams, we can't
> really implement direct conversions to and from Parquet in the middle of a
> flow, but we can perform the conversion by taking any record format
> and writing to a Path as Parquet, or reading Parquet from a Path and writing
> it out as another record format.
> We should add a PutParquet that uses a record reader and writes records to a
> Path as Parquet, and a FetchParquet that reads Parquet from a path and writes
> out records to a flow file using a record writer.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)