[
https://issues.apache.org/jira/browse/SPARK-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15306645#comment-15306645
]
KaiXinXIaoLei commented on SPARK-8118:
--------------------------------------
[~lian cheng] I ran the following queries using spark-sql --master yarn:
{noformat}
create table a(key INT, value String) stored as parquet;
insert overwrite table a select * from text_db.a;
insert overwrite table a select * from text_db.a;
{noformat}
I found that the same log output is written to stdout:
{noformat}
vm3:/opt/apache/hadoop/logs/userlogs/application_1464609606092_0001 # cat container_1464609606092_0001_01_000003/stdout
May 30, 2016 8:01:17 PM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
May 30, 2016 8:01:19 PM INFO: org.apache.parquet.hadoop.codec.CodecConfig: Compression: GZIP
{noformat}
If I run
{noformat}
insert overwrite table a select * from text_db.a;
{noformat}
many times, I find that the Parquet log is written to stderr.
> Turn off noisy log output produced by Parquet 1.7.0
> ---------------------------------------------------
>
> Key: SPARK-8118
> URL: https://issues.apache.org/jira/browse/SPARK-8118
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 1.4.1, 1.5.0
> Reporter: Cheng Lian
> Assignee: Cheng Lian
> Priority: Minor
> Fix For: 1.5.0
>
>
> Parquet 1.7.0 renames its package to "org.apache.parquet", so we need to adjust
> {{ParquetRelation.enableLogForwarding}} accordingly to avoid noisy log output.
> A better approach than simply muting these log lines is to redirect Parquet
> logs via SLF4J, so that we can handle them consistently. In general these
> logs are very useful, especially when diagnosing Parquet memory issues and
> filter push-down.
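
The redirection idea in the description can be sketched roughly as follows. This is a minimal illustration, not the actual body of {{ParquetRelation.enableLogForwarding}}. It assumes Parquet logs through java.util.logging under the "org.apache.parquet" logger name. The class ParquetLogRedirect, the ForwardingHandler, and the in-memory `forwarded` list are hypothetical stand-ins; in a real setup the handler would more likely be SLF4JBridgeHandler from the jul-to-slf4j library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class ParquetLogRedirect {
    // Hypothetical sink standing in for an SLF4J-backed logger.
    static final List<String> forwarded = new ArrayList<>();

    // Keep a strong reference so the logger's configuration is not garbage collected.
    static final Logger PARQUET_LOGGER = Logger.getLogger("org.apache.parquet");

    // Hypothetical handler: forwards each record to the sink instead of a console stream.
    static final class ForwardingHandler extends Handler {
        @Override
        public void publish(LogRecord record) {
            if (!isLoggable(record)) {
                return;
            }
            forwarded.add(record.getLevel() + " " + record.getLoggerName()
                    + ": " + record.getMessage());
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    static void redirectParquetLogs() {
        // Drop whatever handlers are already attached (e.g. a console handler).
        for (Handler h : PARQUET_LOGGER.getHandlers()) {
            PARQUET_LOGGER.removeHandler(h);
        }
        // Stop records from also reaching the root logger's handlers.
        PARQUET_LOGGER.setUseParentHandlers(false);
        PARQUET_LOGGER.addHandler(new ForwardingHandler());
    }

    public static void main(String[] args) {
        redirectParquetLogs();
        // Child loggers delegate to the handlers of "org.apache.parquet".
        Logger.getLogger("org.apache.parquet.hadoop.ParquetFileReader")
                .info("Initiating action with parallelism: 5");
        System.out.println(forwarded);
    }
}
```

Because the handler is attached to the package-level logger, every logger beneath "org.apache.parquet" is covered. Such a redirect only takes effect in the JVM where it is installed, so it would have to run on each executor as well as on the driver.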