Hi,

I want to use Spark as a query engine on HBase with sub-second latency.

I am using Spark 1.3, and I followed the steps below on an HBase table
with around 3.5 lakh (350,000) rows:


1.       Mapped the DataFrame to the HBase table. RDDCustomers maps to the
HBase table and is used to create the DataFrame:

    DataFrame schemaCustomers = sqlInstance
            .createDataFrame(SparkContextImpl.getRddCustomers(),
                    Customers.class);

2.       Registered the DataFrame as a temp table:

    schemaCustomers.registerTempTable("customers");

3.       Ran the query on the DataFrame using the SQLContext instance.

What I am observing is that a single query with one filter criterion takes
7-8 seconds, and the time increases as I increase the number of rows in the
HBase table. On one occasion, however, I got a query response in 1-2 seconds,
which seems strange.
Is this expected behavior from Spark, or am I missing something here?
Can somebody help me understand this scenario? Please assist.
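
To make the timings easier to compare across runs, here is a minimal sketch of
how the per-run latency could be measured. QueryTimer and timeRuns are
hypothetical names (not part of Spark), and the Runnable is only a stand-in
for the real query call, which is assumed to be something like
sqlInstance.sql(...) followed by collect().

```java
import java.util.Arrays;

// Hypothetical helper (not part of Spark): times repeated runs of one query.
class QueryTimer {

    // Runs `query` `runs` times and returns each run's wall-clock time in ms.
    static long[] timeRuns(Runnable query, int runs) {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption); replace the body with the real
        // query call, e.g. running the SQL and collecting the rows.
        Runnable query = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };
        // Prints one elapsed time per run, first run first.
        System.out.println(Arrays.toString(timeRuns(query, 5)));
    }
}
```

Comparing the first entry with the later ones would show whether the 7-8
seconds is paid on every run or only on the first one.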

Thanks,
Siddharth Ubale
