[
https://issues.apache.org/jira/browse/PARQUET-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057800#comment-15057800
]
Michal Turek commented on PARQUET-401:
--------------------------------------
Hi, our libraries use generic dependency {{slf4j-api}} and applications
concrete {{logback}}. I started using {{parquet-avro}} library in our
application and found a possibly related issue that is not so obvious. So am
adding a comment for considering during the refactoring.
{noformat}
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-avro</artifactId>
<version>1.8.1</version>
</dependency>
{noformat}
The following lines are appearing in stderr on close of Parquet file
({{ParquetWriter<GenericRecord>}}).
{noformat}
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.
{noformat}
After a few hours of debugging and frustration I found the following...
{{parquet-avro}} depends on
{{org.apache.parquet:parquet-format:2.3.0-incubating}} which is shadowing slf4j
library. This means that Java CLASSPATH contains everything from slf4j twice -
for example standard one {{org.slf4j.LoggerFactory}} and shadow
{{parquet.org.slf4j.LoggerFactory}}.
The standard implementation is able to find {{logback}} and properly configure
logging for the application code using {{logback.xml}}, the second one fails
and displays the error log above (including improper package name).
After finding this it was easy to put a breakpoint to the correct place and
find the caller, which is
{{parquet.org.apache.thrift.transport.TIOStreamTransport}}.
I found commit
https://github.com/apache/parquet-format/commit/145f4df56bacd388b77ac9ff62d0b7c3c1b4eab9
and PARQUET-369 that already tries to solve this issue, but I don't think that
adding shaded dependency to slf4j-nop is correct. This only disables logging
for this library, including warning/error log message from
{{TIOStreamTransport.close()}}. Very important information is effectively lost
in such case.
Please consider to implement logging properly and depend only on non-shadowed
{{slf4j-api}} and let the users of your libraries to decide which log messages
to see and which not. I can attach an example source code if you want, but I
think the issue is quite obvious. Please let me know in case of any questions.
> Deprecate Log and move to SLF4J Logger
> --------------------------------------
>
> Key: PARQUET-401
> URL: https://issues.apache.org/jira/browse/PARQUET-401
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.8.1
> Reporter: Ryan Blue
>
> The current Log class is intended to allow swapping out logger back-ends, but
> SLF4J already does this. It also doesn't expose as nice of an API as SLF4J,
> which can handle formatting to avoid the cost of building log messages that
> won't be used. I think we should deprecate the org.apache.parquet.Log class
> and move to using SLF4J directly, instead of wrapping SLF4J (PARQUET-305).
> This will require deprecating the current Log class and replacing the current
> uses of it with SLF4J.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)