Hi, I want to use Spark as a query engine on HBase with sub-second latency.
I am using Spark 1.3 and followed the steps below on an HBase table with around 3.5 lakh (350,000) rows:

1. Mapped the DataFrame to the HBase table. RDDCustomers maps to the HBase table and is used to create the DataFrame:
   DataFrame schemaCustomers = sqlInstance.createDataFrame(SparkContextImpl.getRddCustomers(), Customers.class);
2. Registered the DataFrame as a temporary table:
   schemaCustomers.registerTempTable("customers");
3. Ran the query against the temporary table using the SQLContext instance.

What I observe is that a single query with one filter criterion takes 7-8 seconds, and the time increases as I increase the number of rows in the HBase table. There was also one occasion when the query responded in 1-2 seconds, which seems like strange behavior.

Is this expected behavior from Spark, or am I missing something here? Can somebody help me understand this scenario?

Please assist.

Thanks,
Siddharth Ubale