Sorry for the excessive logging. The pushdown-predicate logging should only happen once, at the start of the read. Was there a particular message being repeated per row?
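
If it is the predicate dump itself that repeats, a minimal sketch of the kind of fix involved, assuming the message comes from the LOG.info call in OrcInputFormat.setSearchArgument shown in your example (LOG here is the class's commons-logging logger, and "sarg" is a stand-in name for the SearchArgument being printed):

// Sketch only: demote the predicate dump to DEBUG and guard it, so the
// SearchArgument is not stringified unless DEBUG logging is enabled.
if (LOG.isDebugEnabled()) {
  LOG.debug("ORC pushdown predicate: " + sarg);
}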
Thanks,
   Owen

On Mon, Apr 6, 2015 at 9:15 AM, Moore, Douglas <[email protected]> wrote:

> On a cluster recently upgraded to Hive 0.14 (HDP 2.2), we found gigabytes
> (millions of entries) of additional INFO-level hive.log output coming from
> the ORC packages.
>
> I feel these log entries should be at the DEBUG level.
>
> Is there an existing bug filed against Hive or ORC for this?
>
> Here is one example:
>
> 2015-04-06 15:12:43,212 INFO orc.OrcInputFormat
> (OrcInputFormat.java:setSearchArgument(298)) - ORC pushdown predicate:
> leaf-0 = (EQUALS company XYZ)
> leaf-1 = (EQUALS site DEF)
> leaf-2 = (EQUALS table ABC)
> expr = (and leaf-0 leaf-1 leaf-2)
>
> To get an acceptable amount of logging that did not fill /tmp, we had to
> add these entries to /etc/hive/conf/hive-log4j.settings:
>
> log4j.logger.org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger=WARN,DRFA
> log4j.logger.org.apache.hadoop.hive.ql.io.orc.ReaderImpl=WARN,DRFA
> log4j.logger.org.apache.hadoop.hive.ql.io.orc.OrcInputFormat=WARN,DRFA
>
> While I'm on the subject: to operationally harden Hive, I think Hive
> should ship with a more aggressive rolling file appender by default, one
> that can roll hourly or at a maximum size and compress the rolled logs…
>
> - Douglas
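
For comparison, a minimal sketch of a size-capped variant, assuming the stock hive-log4j.properties layout shipped with Hive 0.14 (the DRFA appender name and the ${hive.log.dir}/${hive.log.file} properties come from that file; the size and backup-count values below are arbitrary). Note that core log4j 1.2's RollingFileAppender rolls on size only and does not compress rolled files; hourly rolling plus gzip would need something like log4j-extras' TimeBasedRollingPolicy, which is configured through log4j.xml rather than a properties file.

# Sketch only: replace the daily appender with size-capped rolling.
log4j.appender.DRFA=org.apache.log4j.RollingFileAppender
log4j.appender.DRFA.File=${hive.log.dir}/${hive.log.file}
log4j.appender.DRFA.MaxFileSize=256MB
log4j.appender.DRFA.MaxBackupIndex=10
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n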
