Any chance you could sketch out the Shark APIs that you use for this?
Matei's response suggests that the preferred API is coming in the next
release (i.e., the RDDTable class in 0.8.1). Are you building Shark from the
latest in the repo and using that? Or have you figured out other API
calls

Hi Philip,
There are a few things you can do:
- If you want to avoid the data copy with a CREATE TABLE statement, you can use
CREATE EXTERNAL TABLE, which points to an existing file or directory.
- If you always reuse the same table, you could CREATE TABLE only once and then
simply place

I have a simple scenario that I'm struggling to implement. I would like
to take a fairly simple RDD generated from a large log file, perform
some transformations on it, and write the results out such that I can
perform a Hive query either from Hive (via Hue) or Shark. I'm having
trouble
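
For reference, the CREATE EXTERNAL TABLE suggestion from the reply could look roughly like the sketch below. This is only an illustration: the table name, column names, and path are hypothetical, and it assumes the transformed RDD has already been written out as tab-delimited text lines (for example with saveAsTextFile) into a directory that Hive and Shark can both read.

```sql
-- Hypothetical table, columns, and location; adjust to the real log schema.
-- Assumes the directory already holds tab-delimited text files written
-- from the transformed RDD. No data is copied: the table only points at it.
CREATE EXTERNAL TABLE IF NOT EXISTS parsed_logs (
  ts      STRING,
  host    STRING,
  message STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/hypothetical/parsed_logs';

-- The same query should then work from either Hive or Shark.
SELECT host, COUNT(*) AS hits
FROM parsed_logs
GROUP BY host;
```

Because the table is external, dropping it removes only the metadata and leaves the files in place, so the job can write new output to the same directory without recreating the table.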