You need the table in an efficient format, such as ORC or Parquet. Sort the
table appropriately (hint: by the most discriminating column in the WHERE
clause). Do not use SAN or virtualization for the slave nodes.
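As a rough sketch of that advice (the table name "events" and sort column
"event_date" are made-up examples, and this assumes Spark 1.x with Hive
support via HiveContext):

```scala
// Sketch only: "events" and "event_date" are hypothetical names.
// Requires a Spark deployment built with Hive support.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("convert-to-parquet"))
val hiveContext = new HiveContext(sc)

// Read the existing Hive table as a DataFrame.
val df = hiveContext.table("events")

// Sort by the most discriminating column used in your WHERE clauses,
// then persist in an efficient columnar format (Parquet here; ORC also works).
df.sort("event_date")
  .write
  .format("parquet")
  .saveAsTable("events_parquet")
```

Queries that filter on the sort column can then skip large runs of row groups
thanks to Parquet's min/max statistics.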
Can you please post your query?
I always recommend avoiding single updates where p
It depends on how you fetch the single row. Is your query complex?
On Thu, Jan 7, 2016 at 12:47 PM, Balaraju.Kagidala Kagidala <
balaraju.kagid...@gmail.com> wrote:
> Hi ,
>
> I am a new user to Spark. I am trying to use Spark to process huge Hive
> data using Spark DataFrames.
>
>
> I have 5