[jira] [Commented] (NIFI-1280) Create FilterCSVColumns Processor

ASF GitHub Bot (JIRA) Wed, 25 May 2016 07:10:07 -0700

    [ https://issues.apache.org/jira/browse/NIFI-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300098#comment-15300098 ]


ASF GitHub Bot commented on NIFI-1280:
--------------------------------------

Github user joewitt commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/420#discussion_r64578812
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/pom.xml ---
    @@ -235,6 +235,11 @@ language governing permissions and limitations under the License. -->
                 <artifactId>json-utils</artifactId>
                 <version>0.0.20</version>
             </dependency>
    +        <dependency>
    +            <groupId>org.apache.calcite</groupId>
    +            <artifactId>calcite-example-csv</artifactId>
    --- End diff --
    
    "We don't want multiple CSV libraries, do we?"
    I agree it is ideal to avoid multiple libraries that 'do the same thing'.
    
    However...
    
    What we really favor is much more than just that.  Bringing in something
    like Calcite's libraries means we're adopting their design decisions, which
    they made in good faith and with real effort, just as we do with internally
    developed things.  Cases like this cause natural tension over transitive
    dependencies.  This is a great example of why we've provided the
    classloader isolation model we have in NiFi.
    https://en.wiktionary.org/wiki/let_a_thousand_flowers_bloom


> Create FilterCSVColumns Processor
> ---------------------------------
>
>                 Key: NIFI-1280
>                 URL: https://issues.apache.org/jira/browse/NIFI-1280
>             Project: Apache NiFi
>          Issue Type: Task
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Toivo Adams
>
> We should have a Processor that allows users to easily filter out specific 
> columns from CSV data. For instance, a user would configure two different 
> properties: "Columns of Interest" (a comma-separated list of column indexes) 
> and "Filtering Strategy" (Keep Only These Columns, Remove Only These Columns).
> We can do this today with ReplaceText, but it is far more difficult than it 
> would be with this Processor, as the user has to use Regular Expressions, 
> etc. with ReplaceText.
> Eventually a Custom UI could even be built that allows a user to upload a 
> Sample CSV and choose columns from it by dragging and selecting them, similar 
> to the way that Excel works when importing CSV. That would certainly be a 
> larger undertaking and would not need to be done for an initial 
> implementation.
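
The filtering described in the issue could be sketched roughly as follows. This is only an illustration of the two proposed properties, not the processor from the pull request: the class name, the enum, and both method names are hypothetical, the column indexes are assumed to be zero-based (the issue does not say), and the naive comma split assumes no quoted fields or embedded commas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class CsvColumnFilterSketch {

    /** Hypothetical stand-in for the proposed "Filtering Strategy" property. */
    public enum Strategy { KEEP_ONLY, REMOVE_ONLY }

    /** Parses a "Columns of Interest" value such as "0, 2" (assumed zero-based). */
    public static Set<Integer> parseIndexes(String columnsOfInterest) {
        Set<Integer> indexes = new TreeSet<>();
        for (String part : columnsOfInterest.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                indexes.add(Integer.parseInt(trimmed));
            }
        }
        return indexes;
    }

    /** Filters one CSV line; assumes plain comma-separated fields without quoting. */
    public static String filterLine(String line, Set<Integer> indexes, Strategy strategy) {
        // The -1 limit keeps trailing empty fields so column positions stay stable.
        String[] fields = line.split(",", -1);
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < fields.length; i++) {
            boolean listed = indexes.contains(i);
            // KEEP_ONLY retains listed columns; REMOVE_ONLY retains the unlisted ones.
            if (listed == (strategy == Strategy.KEEP_ONLY)) {
                kept.add(fields[i]);
            }
        }
        return String.join(",", kept);
    }

    public static void main(String[] args) {
        Set<Integer> indexes = parseIndexes("0, 2");
        System.out.println(filterLine("id,name,email", indexes, Strategy.KEEP_ONLY));   // id,email
        System.out.println(filterLine("id,name,email", indexes, Strategy.REMOVE_ONLY)); // name
    }
}
```

A real implementation would need a proper CSV parser to handle quoting and escaped delimiters, which is where the library question raised in the review comment comes in.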



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
