GitHub user justinleet commented on a diff in the pull request:
https://github.com/apache/metron/pull/1229#discussion_r223724471
--- Diff: metron-analytics/metron-profiler-spark/README.md ---
@@ -265,6 +290,18 @@ The path to the input data read by the Batch Profiler.
The format of the input data read by the Batch Profiler.
+### `profiler.batch.input.reader`
--- End diff --
This feels somewhat unnecessary as a parameter.
Why couldn't we just keep `profiler.batch.input.format` and drop this?
We can determine the reader from the format (COLUMNAR if ORC/Parquet, TEXT
otherwise). If we add other formats in the future, we'd still know which
reader to use, right?
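
A minimal sketch of the mapping I have in mind, to make the suggestion concrete. The class, enum, and method names here are hypothetical and not taken from the Metron code base; only the COLUMNAR/TEXT split and the ORC/Parquet rule come from the comment above. It also assumes the format arrives as a plain string and is matched case-insensitively.

```java
import java.util.Locale;

// Hypothetical sketch; these names do not come from the Metron code base.
final class ReaderSelection {

    // The two reader kinds mentioned in this review comment.
    enum Reader { COLUMNAR, TEXT }

    // Derives the reader from the value of `profiler.batch.input.format`,
    // so that no separate reader property is needed.
    static Reader readerFor(String format) {
        // Assumption: a missing format falls through to the TEXT reader.
        String normalized = format == null ? "" : format.trim().toLowerCase(Locale.ROOT);
        switch (normalized) {
            case "orc":
            case "parquet":
                return Reader.COLUMNAR;
            default:
                return Reader.TEXT;
        }
    }

    public static void main(String[] args) {
        for (String format : new String[] {"orc", "parquet", "json"}) {
            System.out.println(format + " -> " + readerFor(format));
        }
    }
}
```

Supporting a new columnar format would then only mean adding a `case` label, and anything unrecognized keeps using the TEXT reader.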
---