Re: Speeding Up Group By Queries

2016-04-11 Thread James Taylor
Hi Amit, If a query doesn't filter on the primary key columns, the entire table must be scanned (hence it'll be slower). Secondary indexes[1] on the non-primary-key columns are the way to improve performance for these cases. Take a look at this[2] presentation for more detail. Also, a 3 node
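The secondary-index approach James describes can be sketched as follows. This is a minimal, hypothetical example (the table and column names are illustrative, not from the thread), and it assumes Phoenix 4.x global-index syntax:

```sql
-- Hypothetical table: filtering on a non-PK column (region) forces a full scan.
CREATE TABLE metrics (
    host   VARCHAR NOT NULL,
    ts     DATE NOT NULL,
    region VARCHAR,
    cpu    DECIMAL,
    CONSTRAINT pk PRIMARY KEY (host, ts)
);

-- A secondary index on the non-primary-key column lets Phoenix serve the
-- filter from the index table instead of scanning the data table.
-- Including cpu as a covered column avoids a lookup back to the data table.
CREATE INDEX idx_metrics_region ON metrics (region) INCLUDE (cpu);

-- This query can now be answered from idx_metrics_region rather than a full scan.
SELECT region, AVG(cpu) FROM metrics
WHERE region = 'us-east'
GROUP BY region;
```

Whether Phoenix actually picks the index can be confirmed with EXPLAIN on the query.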

Re: Speeding Up Group By Queries

2016-04-11 Thread Amit Shah
Hi Mujtaba, I observed that if the WHERE clause and GROUP BY are applied on the primary key columns, the queries are super fast (~200 ms). This is not the case for queries that have non-primary-key columns in the WHERE and GROUP BY clauses. I tried configuring the bucket cache but

Re: [Error] Spark - Save to Phoenix

2016-04-11 Thread Divya Gehlot
Hi Ricardo, I tried that as well; still the same error. Thanks, Divya On 11 April 2016 at 18:51, Ricardo Crespo wrote: > Yes, sorry, I didn't see it at first. > > Could you test this: > > Instead of: > > --conf >

Re: [Error] Spark - Save to Phoenix

2016-04-11 Thread Ricardo Crespo
Yes, sorry, I didn't see it at first. Could you test this: instead of: --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar" using it without quotes: --conf spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar 2016-04-11

Re: [Error] Spark - Save to Phoenix

2016-04-11 Thread Ricardo Crespo
Yes, sorry, I didn't see it at first. Could you test this: 2016-04-11 12:43 GMT+02:00 Divya Gehlot : > Hi Ricardo, > Are you talking about the Phoenix jar below? > --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar" > > That also

Re: [Error] Spark - Save to Phoenix

2016-04-11 Thread Divya Gehlot
Hi Ricardo, Are you talking about the Phoenix jar below? --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar" That has also been added. Please refer to my first post. Thanks, Divya On 11 April 2016 at 18:33, Ricardo Crespo wrote: >

Re: [Error] Spark - Save to Phoenix

2016-04-11 Thread Ricardo Crespo
Divya, I checked, and the error that you are getting is on the executor. Could you add this parameter to spark-submit: --conf spark.executor.extraClassPath=path_to_phoenix_jar That way the executor should get Phoenix on the classpath. Best Regards, Ricardo 2016-04-11 12:15 GMT+02:00 Divya
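Ricardo's suggestion, written out as a full spark-submit invocation. This is a sketch under assumptions: the jar path follows the HDP layout mentioned elsewhere in the thread, the application class and jar names are hypothetical, and adding the driver classpath alongside the executor one is a common companion step, not something stated in the thread:

```shell
# Put the Phoenix client jar on the executor (and, commonly, driver) classpath
# so the executors can load the Phoenix classes at runtime.
spark-submit \
  --conf spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar \
  --conf spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar \
  --class com.example.SaveToPhoenix \
  my-app.jar
```

Note that on a POSIX shell the quotes around a `--conf` value are normally stripped before spark-submit sees it; writing the option unquoted mainly rules out copy-paste issues such as smart quotes or stray whitespace inside the quoted string.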

Re: [Error] Spark - Save to Phoenix

2016-04-11 Thread Divya Gehlot
Hi Ricardo, If you had looked at my previous post carefully, you would see that I am already passing the jars below: --jars /usr/hdp/2.3.4.0-3485/phoenix/lib/phoenix-spark-4.4.0.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/phoenix/phoenix-client.jar,/usr/hdp/2.3.4.0-3485/phoenix/phoenix-server.jar When I try to test

Understanding Phoenix Query Plans

2016-04-11 Thread Amit Shah
Hi, I am using HBase version 1.0 and Phoenix version 4.6. For the different queries that we are benchmarking, I am trying to understand the query plans. 1. If we execute a WHERE clause with a GROUP BY query on the primary key columns of the table, the plan looks like
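For reference, a hedged sketch of what EXPLAIN typically shows for the two cases being benchmarked. The table and column names are hypothetical, and the exact plan text varies with Phoenix version and statistics:

```sql
-- Hypothetical table with composite primary key (host, ts).
-- Filtering and grouping on a leading PK column lets Phoenix use a
-- range scan with an ordered server-side aggregation:
EXPLAIN SELECT host, COUNT(*)
FROM metrics
WHERE host = 'h1'
GROUP BY host;
-- Typical shape: CLIENT ... RANGE SCAN OVER METRICS ['h1']
--                SERVER AGGREGATE INTO ORDERED ROWS BY [HOST]

-- The same query on a non-PK column without an index falls back to a
-- full scan, a server-side filter, and an unordered aggregation:
EXPLAIN SELECT region, COUNT(*)
FROM metrics
WHERE region = 'us-east'
GROUP BY region;
-- Typical shape: CLIENT ... FULL SCAN OVER METRICS
--                SERVER FILTER BY REGION = 'us-east'
--                SERVER AGGREGATE INTO DISTINCT ROWS BY [REGION]
```

The ORDERED vs DISTINCT aggregation step in the plan is one way to see why the primary-key queries in this thread run so much faster.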