Hi,

FSFileSplitter was recently added to the Malhar library. I have created https://issues.apache.org/jira/browse/APEXMALHAR-2081 to remove this operator and add its functionality to FileSplitterInput.

The reason is that this extension adds only 3 trivial features, and having a separate operator for them makes it difficult for users to know which operator to use. It also adds more classes that essentially do the same thing.

This operator adds 3 properties to FileSplitterInput:

1. ignoreFilePatternRegularExp: a regular expression that specifies which files to ignore. This is useful to have in FileSplitterInput.

2. unsupportedChar: first of all, this is a String. Files having this String will be ignored. IMO this is redundant, since #1 can be used to accomplish the same thing. I think this should be removed.

3. sequentialFileReader: when this property is set, the block metadata of the same file all have the same hashcode. I think this may have been done so that all the block metadata of a particular file go to the same block reader. IMO this is a hacky way of accomplishing that. If an application needs this, it should be done using a StreamCodec. I think this should be removed.

Thanks,
Chandni
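
P.S. A rough sketch of the reasoning behind #2 and #3, in plain Java rather than against the Malhar API. Every name here (SplitterSketch, ignorePatternFor, shouldIgnore, partitionFor) is hypothetical and only for illustration. It also assumes that the ignore regex is matched against the whole file name, which I have not checked against the operator code. The first part shows a literal "unsupported" string written as an ignore regex. The second part shows a partition being chosen from the file path alone, which is the kind of logic I would expect to sit in a StreamCodec's getPartition method instead of in the hashcode of the block metadata.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from Malhar.
final class SplitterSketch {

    // #2: build an ignore regex that matches any name containing the literal.
    // Pattern.quote escapes regex metacharacters such as '$' or '.'.
    static Pattern ignorePatternFor(String unsupported) {
        return Pattern.compile(".*" + Pattern.quote(unsupported) + ".*");
    }

    // Assumption: the ignore regex must match the entire file name.
    static boolean shouldIgnore(Pattern ignore, String fileName) {
        return ignore.matcher(fileName).matches();
    }

    // #3: derive the partition from the file path only, so every block of
    // one file maps to the same partition. floorMod keeps the result
    // non-negative even when hashCode() is negative.
    static int partitionFor(String filePath, int partitionCount) {
        return Math.floorMod(filePath.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        Pattern ignore = ignorePatternFor("$");
        System.out.println(shouldIgnore(ignore, "report$2016.csv"));
        System.out.println(shouldIgnore(ignore, "report2016.csv"));
        System.out.println(partitionFor("/data/a.log", 4));
    }
}
```

With routing decided this way, the block metadata can keep an ordinary hashcode, and an application that does not need per-file routing simply does not set the codec.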
