Re: Review Request 11029: HIVE-4513 - disable hivehistory logs by default

2013-08-10 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11029/
---

(Updated Aug. 10, 2013, 4:24 p.m.)


Review request for hive.


Changes
---

HIVE-4513.6.patch - addresses review comments. Fixes race condition that was 
causing the TestHiveServerSessions.testSessionVars test failure.


Bugs: HIVE-4513
https://issues.apache.org/jira/browse/HIVE-4513


Repository: hive-git


Description
---

HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
hive query such as query string, plan , counters and MR job progress 
information.

There is no mechanism to delete these files and as a result they get 
accumulated over time, using up lot of disk space. 
I don't think this is used by most people, so I think it would better to turn 
this off by default. Jobtracker logs already capture most of this information, 
though it is not as structured as history logs.

The change :
A new config parameter hive.session.history.enabled controls if the history-log 
is enabled. By default it is set to false.
SessionState initializes the HiveHIstory object. When this config is set to 
false, it creates a Proxy object that does not do anything. I did this instead 
of having SessionState return null, because that would add null checks in too 
many places. This keeps the code cleaner and avoids possibility of NPE.
As the proxy only works against interfaces, i created a HiveHistory interface, 
moved the implementation to HiveHistoryImpl. static functions were moved to 
HiveHistoryUtil .


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 83f337b 
  conf/hive-default.xml.template 0a6e433 
  hbase-handler/src/test/templates/TestHBaseCliDriver.vm c59e882 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java 97436c5 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryProxyHandler.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java fdd56db 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java ab369f0 
  ql/src/test/org/apache/hadoop/hive/ql/history/TestHiveHistory.java a783303 
  ql/src/test/templates/TestCliDriver.vm a6ae6c3 

Diff: https://reviews.apache.org/r/11029/diff/


Testing
---


Thanks,

Thejas Nair



Re: Review Request 11029: HIVE-4513 - disable hivehistory logs by default

2013-08-10 Thread Thejas Nair


 On June 3, 2013, 9:03 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java, line 86
  https://reviews.apache.org/r/11029/diff/2/?file=290954#file290954line86
 
  In case of incorrect config, should this throw an exception instead of 
  silent return, otherwise there will be errors later when something is tried 
  to be written in history file.

Errors will not be there later, as it does not write if the histStream has not 
been initialized. 
I don't think we should fail the query just because hive history logging 
failed. This is also current behavior in hive.


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11029/#review21352
---


On Aug. 10, 2013, 4:24 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11029/
 ---
 
 (Updated Aug. 10, 2013, 4:24 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-4513
 https://issues.apache.org/jira/browse/HIVE-4513
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 hive query such as query string, plan , counters and MR job progress 
 information.
 
 There is no mechanism to delete these files and as a result they get 
 accumulated over time, using up lot of disk space. 
 I don't think this is used by most people, so I think it would better to turn 
 this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as history logs.
 
 The change :
 A new config parameter hive.session.history.enabled controls if the 
 history-log is enabled. By default it is set to false.
 SessionState initializes the HiveHIstory object. When this config is set to 
 false, it creates a Proxy object that does not do anything. I did this 
 instead of having SessionState return null, because that would add null 
 checks in too many places. This keeps the code cleaner and avoids possibility 
 of NPE.
 As the proxy only works against interfaces, i created a HiveHistory 
 interface, moved the implementation to HiveHistoryImpl. static functions were 
 moved to HiveHistoryUtil .
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 83f337b 
   conf/hive-default.xml.template 0a6e433 
   hbase-handler/src/test/templates/TestHBaseCliDriver.vm c59e882 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java 97436c5 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryProxyHandler.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java 
 fdd56db 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java ab369f0 
   ql/src/test/org/apache/hadoop/hive/ql/history/TestHiveHistory.java a783303 
   ql/src/test/templates/TestCliDriver.vm a6ae6c3 
 
 Diff: https://reviews.apache.org/r/11029/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Thejas Nair