Hello David,

As I mentioned on PARQUET-1758, we have been frustrated by overly verbose 
logging in Parquet for a long time.  Various workarounds have been more or less 
successful, e.g.

https://github.com/bigdatagenomics/adam/issues/851

I would support a move making Parquet a silent partner.  :)

   michael


> On Jan 23, 2020, at 10:25 AM, David Mollitor <[email protected]> wrote:
> 
> Hello Team,
> 
> I have been a consumer of Apache Parquet through Apache Hive for several
> years now.  For a long time, logging in Parquet has been pretty painful.
> Some of the logging was going to STDOUT and some was going to Log4J.
> Overall, though, the framework has been too verbose, spewing many log lines
> about internal details of Parquet I don't understand.
> 
> The logging has gotten a lot better with recent releases moving solidly
> into SLF4J.  That is awesome and very welcome.  However, (opinion alert) I
> think the logging is still too verbose.  I think Parquet should be a silent
> partner in data processing.  If everything is going well, it should be
> silent (DEBUG level logging).  If things are going wrong, it should throw
> an Exception.
> 
> If an operator suspects Parquet is the issue (and that's rarely the first
> thing to check), they can set the logging for all of the Loggers in the
> entire Parquet package (org.apache.parquet) to DEBUG to get the required
> information.  Not to mention, the less logging it does, the faster it will
> be.
> 
> I've opened this discussion because I've got two PRs related to this topic
> ready to go:
> 
> PARQUET-1758
> PARQUET-1761
> 
> Thanks,
> David
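
The package-level switch described in the quoted message can be sketched as follows. This is only an illustration under stated assumptions: Parquet logs through SLF4J, so the real setting lives in whichever backend is bound to it, and java.util.logging is used here purely as a stand-in from the JDK so the example is self-contained (its FINE level plays the role of DEBUG). The class name ParquetLogLevelSketch and the child logger name org.apache.parquet.hadoop.Example are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class ParquetLogLevelSketch {
    // Strong references are kept because java.util.logging holds loggers weakly.
    public static final Logger PACKAGE_LOGGER = Logger.getLogger("org.apache.parquet");
    // Hypothetical child logger standing in for any class inside the package.
    public static final Logger CHILD_LOGGER =
            Logger.getLogger("org.apache.parquet.hadoop.Example");

    /** Returns true if the child logger would emit FINE (DEBUG-equivalent) records. */
    public static boolean childLogsDebug() {
        return CHILD_LOGGER.isLoggable(Level.FINE);
    }

    public static void main(String[] args) {
        // Quiet by default: only warnings and above pass the package logger.
        PACKAGE_LOGGER.setLevel(Level.WARNING);
        System.out.println("debug enabled: " + childLogsDebug());

        // Operator suspects Parquet: one setting on the package logger
        // turns on detail for every logger beneath it.
        PACKAGE_LOGGER.setLevel(Level.FINE);
        System.out.println("debug enabled: " + childLogsDebug());
    }
}
```

The child logger has no level of its own, so it inherits from the nearest ancestor that does. That inheritance is what makes a single package-wide setting sufficient; SLF4J backends such as Log4j and Logback follow the same hierarchical naming idea, though their configuration syntax differs.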
