Using partitioning to speed up queries in Shark

2014-11-07 Thread Gordon Benjamin
Hi all, I'm using Spark/Shark as the foundation for some reporting and have a customers table with approximately 3 million rows that I've cached in memory. I've also created a table partitioned on a per-day basis, which I've cached in memory as well: FROM customers_cached INSERT

Re: Using partitioning to speed up queries in Shark

2014-11-07 Thread Mayur Rustagi
- dev list + user list

Shark is not officially supported anymore, so you are better off moving to Spark SQL. Shark doesn't support Hive partitioning logic anyway; it has its own version of partitioning on in-memory blocks, but that is independent of whether you partition your data in Hive or not.

Mayur
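
For readers wondering why per-day partitioning is expected to help at all, here is a minimal, engine-independent sketch in plain Python. It does not use any Shark, Hive, or Spark SQL API; the row layout, the sample data, and the function names (partition_by_day, customers_on, customers_on_full_scan) are all hypothetical and chosen only for illustration. The idea is that a query for one day reads only that day's bucket instead of scanning every row.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (customer_id, day). Not taken from the thread.
rows = [
    (1, date(2014, 11, 5)),
    (2, date(2014, 11, 6)),
    (3, date(2014, 11, 6)),
    (4, date(2014, 11, 7)),
]


def partition_by_day(rows):
    """Group customer ids into one bucket (partition) per day."""
    parts = defaultdict(list)
    for customer_id, day in rows:
        parts[day].append(customer_id)
    return dict(parts)


def customers_on(parts, day):
    """Partition-pruned lookup: touches only the bucket for `day`."""
    return parts.get(day, [])


def customers_on_full_scan(rows, day):
    """Unpartitioned lookup: checks every row."""
    return [customer_id for customer_id, d in rows if d == day]


partitions = partition_by_day(rows)
```

Both lookups return the same customers for a given day; the partitioned one simply skips the rows belonging to other days. Whether a particular engine applies this kind of pruning to a cached table is a separate question, which is the point of the reply above.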