Hello David,

As I mentioned on PARQUET-1758, we have been frustrated by overly verbose logging in Parquet for a long time. Various workarounds have been more or less successful, e.g.
https://github.com/bigdatagenomics/adam/issues/851

I would support a move making Parquet a silent partner. :)

michael

> On Jan 23, 2020, at 10:25 AM, David Mollitor <[email protected]> wrote:
>
> Hello Team,
>
> I have been a consumer of Apache Parquet through Apache Hive for several
> years now. For a long time, logging in Parquet has been pretty painful.
> Some of the logging was going to STDOUT and some was going to Log4J.
> Overall, though the framework has been too verbose, spewing many log lines
> about internal details of Parquet I don't understand.
>
> The logging has gotten a lot better with recent releases moving solidly
> into SLF4J. That is awesome and very welcomed. However, (opinion alert) I
> think the logging is still too verbose. I think Parquet should be a silent
> partner in data processing. If everything is going well, it should be
> silent (DEBUG level logging). If things are going wrong, it should throw
> an Exception.
>
> If an operator suspects Parquet is the issue (and that's rarely the first
> thing to check), they can set the logging for all of the Loggers in the
> entire Parquet package (org.apache.parquet) to DEBUG to get the required
> information. Not to mention, the less logging it does, the faster it will
> be.
>
> I've opened this discussion because I've got two PRs related to this topic
> ready to go:
>
> PARQUET-1758
> PARQUET-1761
>
> Thanks,
> David
