If you already have a lot of queries, then it makes sense to look at Hive (in a
recent version) + Tez + LLAP, with all tables in ORC format, partitioned and
sorted on the filter columns. That would be the easiest way and can improve
performance significantly.
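
As a rough sketch of that layout (the table names, column names and bucket
count below are made up for illustration, and the LLAP setting assumes LLAP
daemons are already running on the cluster), the DDL could look something
like this:

```sql
-- Hypothetical table: partitioned on a date column that queries filter on,
-- bucketed and sorted on another frequently filtered column, stored as ORC.
CREATE TABLE sales_orc (
  customer_id BIGINT,
  amount      DECIMAL(18,2)
)
PARTITIONED BY (sale_date STRING)
CLUSTERED BY (customer_id) SORTED BY (customer_id) INTO 32 BUCKETS
STORED AS ORC;

-- Run on Tez, and let LLAP execute the query fragments where it is available.
SET hive.execution.engine=tez;
SET hive.llap.execution.mode=all;

-- Allow the partition value to come from the data (dynamic partitioning).
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Rewrite an existing (hypothetical) raw table into the ORC layout.
-- The partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE sales_orc PARTITION (sale_date)
SELECT customer_id, amount, sale_date
FROM sales_raw;
```

Which columns to partition and sort on depends on the WHERE clauses of your
daily queries, so treat the choices above as placeholders.
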
If you want to use Spark, e.g. because you
Hi, All,
We are starting to migrate our data to the Hadoop platform, hoping to use
'Big Data' technologies to improve our business.
We are new to the area and would like to get some help from you.
Currently all our data is stored in Hive, and some complicated SQL queries
are run daily.
We