Ian,
The LZFOutputStream's large byte buffer is sort of annoying. It is much
smaller if you use the Snappy one. The downside of the Snappy one is
slightly less compression (I've seen 10 - 20% larger sizes).
If we can find a compression scheme implementation that doesn't do very
large buffers,
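
For reference, the LZF versus Snappy choice discussed above can be made per application through Spark configuration. This is only a minimal sketch, assuming the Spark 1.x property names spark.io.compression.codec and spark.io.compression.snappy.block.size; the application name and the block size value are illustrative, not recommendations from this thread.

```scala
import org.apache.spark.SparkConf

object CodecConfigSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark 1.x property names and codec class names.
    val conf = new SparkConf()
      .setAppName("codec-config-sketch") // hypothetical application name
      // Select Snappy instead of LZF for Spark's internal compression.
      .set("spark.io.compression.codec", "org.apache.spark.io.SnappyCompressionCodec")
      // Snappy block size in bytes; the value here is only an example.
      .set("spark.io.compression.snappy.block.size", "32768")

    // Print the configured codec to confirm the setting was applied.
    println(conf.get("spark.io.compression.codec"))
  }
}
```

Whether the smaller buffers outweigh the somewhat larger compressed output depends on the workload, so it is worth measuring both codecs before switching.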
Qingyang,
Are you asking about Spark or Shark? (The first email was about Shark; the
last email was about Spark.)
Best,
Haoyuan
On Wed, Jul 9, 2014 at 7:40 PM, qingyang li liqingyang1...@gmail.com
wrote:
Could I set some cache policy to let Spark load data from Tachyon only one
time for all SQL queries? for
It should be possible to improve cluster launch time if we are careful
about what commands we run during setup. One way to do this would be to
walk down the list of things we do for cluster initialization and see if
there is anything we can do to make things faster. Unfortunately this might be
pretty
Shark. Thanks for replying.
Let me clarify my question.
--
I create a table using create table xxx1
tblproperties(shark.cache=tachyon) as select * from xxx2
When executing some SQL (for example, select * from xxx1) using Shark,
Shark will read