[ 
https://issues.apache.org/jira/browse/METRON-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643594#comment-16643594
 ] 

ASF GitHub Bot commented on METRON-1809:
----------------------------------------

Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1229#discussion_r223743370
  
    --- Diff: metron-analytics/metron-profiler-spark/README.md ---
    @@ -265,6 +290,18 @@ The path to the input data read by the Batch Profiler.
     
     The format of the input data read by the Batch Profiler.
     
    +### `profiler.batch.input.reader`
    --- End diff --
    
    That is a valid option.  The only reason I did not do that is that we would 
have to specifically support each format (JSON, CSV, ORC, Parquet, and so on).  
With these two switches, a user can consume a variety of formats through 
configuration alone, without us having to specifically support each one.
    
    That being said, I don't know how useful that is to the user 
population.  How many formats will users want to consume?  How useful is that 
flexibility?
    
    At this point, since this is new functionality, I decided to err on the 
side of greater flexibility over simplicity. Knowing that reasoning, let me 
know if you still think we should go for simplicity over flexibility.
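
    To make the trade-off concrete, here is a minimal sketch of the 
two-switch idea.  It is not the Profiler's actual code: the function names, 
the reader values `text` and `columnar`, and the property names other than 
`profiler.batch.input.reader` are assumptions for illustration only.  The 
point is that the format string is passed through untouched, so only the 
small set of readers needs explicit support.

```python
# Hypothetical sketch only: these names and values are illustrative and are
# not taken from the Metron code base.

def read_as_text(path, fmt):
    # Stand-in for "load `path` using `fmt`, treating each record as raw text".
    return {"reader": "text", "format": fmt, "path": path}


def read_as_columnar(path, fmt):
    # Stand-in for "load `path` using `fmt`, treating each record as a row of columns".
    return {"reader": "columnar", "format": fmt, "path": path}


# Small, closed set: how records are interpreted once they are loaded.
READERS = {"text": read_as_text, "columnar": read_as_columnar}


def load_input(config):
    # Switch 1: the reader, chosen from the closed set above (case-insensitive here).
    reader_name = config["profiler.batch.input.reader"].lower()
    if reader_name not in READERS:
        raise ValueError(f"unknown reader: {reader_name}")
    # Switch 2: the format, an open-ended string handed on without inspection,
    # so a new format needs no code change in this layer.
    fmt = config["profiler.batch.input.format"]    # assumed property name
    path = config["profiler.batch.input.path"]     # assumed property name
    return READERS[reader_name](path, fmt)
```

    With a design like this, adding a format means changing configuration 
only.  The alternative discussed above (one switch with a fixed list of 
formats) would instead require a new entry for each supported format.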



> Support Column Oriented Input with Batch Profiler
> -------------------------------------------------
>
>                 Key: METRON-1809
>                 URL: https://issues.apache.org/jira/browse/METRON-1809
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>            Priority: Major
>
> The Batch Profiler currently only accepts input formats that can be directly 
> serialized to JSON.  This should be enhanced to accept a wider variety of 
> input formats.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
