Harsh J created HIVE-15908:
------------------------------
Summary: OperationLog's LogFile writer should have autoFlush turned on
Key: HIVE-15908
URL: https://issues.apache.org/jira/browse/HIVE-15908
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Reporter: Harsh J
Assignee: Harsh J
Priority: Minor
HiveServer2 (HS2) offers an API to fetch Operation Log results from the
OperationLog file it maintains. The reader inside the OperationLog$LogFile
class reads its input stream line by line, returning whatever lines the OS
currently reports as available in the file.
The writer inside the same class uses a PrintStream to write to the file in
parallel. However, the PrintStream constructor it uses leaves PrintStream's
{{autoFlush}} feature OFF. This causes the BufferedWriter used by
PrintStream to accumulate 8k worth of bytes in memory before flushing the
writes to disk, which delays the logs streamed back to the client. Ideally,
every line should be flushed in full as it is written, for a smoother
experience.
I suggest changing the following line inside {{OperationLog$LogFile}}:
{code}
out = new PrintStream(new FileOutputStream(file));
{code}
to:
{code}
out = new PrintStream(new FileOutputStream(file), true);
{code}
This makes the writer use PrintStream's autoFlush feature, which flushes the
stream whenever println is called or a newline is written, so that log results
stream back to the reader more smoothly:
https://docs.oracle.com/javase/7/docs/api/java/io/PrintStream.html#PrintStream(java.io.OutputStream,%20boolean)
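For illustration only, here is a minimal standalone sketch of the two-argument
constructor in use. It is not the Hive code: the class name AutoFlushSketch,
the method writeThenReadWhileOpen, and the temporary file are all hypothetical.
It assumes only the standard java.io API and shows a separate reader picking up
a line while the auto-flushing writer is still open.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintStream;

// Hypothetical demo class; it only mirrors the writer/reader pairing described above.
class AutoFlushSketch {

    // Writes one line through an auto-flushing PrintStream, then reads it back
    // through an independent reader while the writer is still open.
    static String writeThenReadWhileOpen(String line) throws IOException {
        File file = File.createTempFile("operation-log-sketch", ".log");
        file.deleteOnExit();
        try (PrintStream out = new PrintStream(new FileOutputStream(file), true);
             BufferedReader in = new BufferedReader(new FileReader(file))) {
            // With autoFlush set to true, println flushes the stream after writing.
            out.println(line);
            // The reader sees the line without the writer being closed first.
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeThenReadWhileOpen("query started"));
    }
}
```

The sketch only exercises the autoFlush=true path, since that is the behaviour
the proposed change relies on.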