BufferedPositionedInputStream is not buffered
---------------------------------------------
Key: PIG-201
URL: https://issues.apache.org/jira/browse/PIG-201
Project: Pig
Issue Type: Bug
Reporter: Mathieu Poumeyrol
BufferedPositionedInputStream is actually not buffered, which leads (I guess) to
constant round trips to dfs as bytes are read one by one. I just wrapped the
input stream passed to the constructor in a good old BufferedInputStream.
I measured a 40% performance boost on a script that reads and writes 3.7GB in
dfs through PigStorage on one node. I guess the impact may be greater on a real
hdfs cluster with actual network round trips.
FYI, the issue was found while profiling with the YourKit Java profiler. Useful
toy...
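
To illustrate why the wrapping described above helps, here is a minimal
standalone sketch. It is not Pig's actual code: BufferingSketch,
CountingInputStream and countUnderlyingReads are hypothetical names, and an
in-memory ByteArrayInputStream stands in for the dfs stream. It only counts how
many read calls reach the underlying stream when bytes are consumed one at a
time, with and without a BufferedInputStream in between.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferingSketch {

    /** Hypothetical stand-in for a dfs stream: counts every read call that reaches it. */
    static class CountingInputStream extends InputStream {
        private final InputStream in;
        long calls = 0;

        CountingInputStream(InputStream in) {
            this.in = in;
        }

        @Override
        public int read() throws IOException {
            calls++;
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    /**
     * Reads 'size' bytes one at a time, the way a line parser might, and
     * returns how many read calls reached the underlying stream.
     */
    static long countUnderlyingReads(int size, boolean buffered) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        // The fix described in this issue: wrap the provided stream once, up front.
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        while (in.read() != -1) {
            // consume byte by byte
        }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        int size = 1 << 20; // 1 MiB of dummy data
        System.out.println("unbuffered calls: " + countUnderlyingReads(size, false));
        System.out.println("buffered calls:   " + countUnderlyingReads(size, true));
    }
}
```

Without the wrapper, every byte read turns into one call on the underlying
stream. With it, the underlying stream is only hit when the buffer needs a
refill. On a real dfs client each of those calls may be far more expensive than
in this in-memory sketch, so the call counts are only a rough proxy for the
speedup reported above.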