Mich,

Are you talking about the Phoenix JDBC Server? If so, I forgot about that alternative.
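For context on the Phoenix route discussed below: Phoenix does not see a raw HBase table until a view (or table) has been mapped onto it, so a query like Mich's implies a mapping already exists. A minimal sketch of such a mapping, with column names inferred from the `stock_daily` column family shown in the hbase shell scan output (all stored as strings, hence the `to_number`/`to_date` conversions in the query):

```sql
-- Sketch only: maps a Phoenix view onto the existing HBase table "tsco".
-- Column names are assumptions taken from the scan output quoted below.
CREATE VIEW "tsco" (
    pk VARCHAR PRIMARY KEY,          -- the HBase row key, e.g. TSCO-1-Apr-08
    "stock_daily"."Date"   VARCHAR,
    "stock_daily"."open"   VARCHAR,
    "stock_daily"."high"   VARCHAR,
    "stock_daily"."low"    VARCHAR,
    "stock_daily"."close"  VARCHAR,
    "stock_daily"."volume" VARCHAR,
    "stock_daily"."ticker" VARCHAR,
    "stock_daily"."stock"  VARCHAR
);
```

Because everything is mapped as VARCHAR, numeric filtering and arithmetic have to go through `to_number()`, as in the quoted query.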
Thanks,
Ben

> On Oct 8, 2016, at 11:21 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> I don't think it will work.
>
> You can use Phoenix on top of HBase:
>
> hbase(main):336:0> scan 'tsco', 'LIMIT' => 1
> ROW                 COLUMN+CELL
>  TSCO-1-Apr-08      column=stock_daily:Date, timestamp=1475866783376, value=1-Apr-08
>  TSCO-1-Apr-08      column=stock_daily:close, timestamp=1475866783376, value=405.25
>  TSCO-1-Apr-08      column=stock_daily:high, timestamp=1475866783376, value=406.75
>  TSCO-1-Apr-08      column=stock_daily:low, timestamp=1475866783376, value=379.25
>  TSCO-1-Apr-08      column=stock_daily:open, timestamp=1475866783376, value=380.00
>  TSCO-1-Apr-08      column=stock_daily:stock, timestamp=1475866783376, value=TESCO PLC
>  TSCO-1-Apr-08      column=stock_daily:ticker, timestamp=1475866783376, value=TSCO
>  TSCO-1-Apr-08      column=stock_daily:volume, timestamp=1475866783376, value=49664486
>
> And the same on Phoenix on top of the HBase table:
>
> 0: jdbc:phoenix:thin:url=http://rhes564:8765> select
>     substr(to_char(to_date("Date",'dd-MMM-yy')),1,10) AS TradeDate,
>     "close" AS "Day's close", "high" AS "Day's High", "low" AS "Day's Low",
>     "open" AS "Day's Open", "ticker", "volume",
>     (to_number("low")+to_number("high"))/2 AS "AverageDailyPrice"
>     from "tsco"
>     where to_number("volume") > 0
>     and "high" != '-'
>     and to_date("Date",'dd-MMM-yy') > to_date('2015-10-06','yyyy-MM-dd')
>     order by to_date("Date",'dd-MMM-yy') limit 1;
>
> +-------------+--------------+-------------+------------+-------------+---------+-----------+--------------------+
> | TRADEDATE   | Day's close  | Day's High  | Day's Low  | Day's Open  | ticker  | volume    | AverageDailyPrice  |
> +-------------+--------------+-------------+------------+-------------+---------+-----------+--------------------+
> | 2015-10-07  | 197.00       | 198.05      | 184.84     | 192.20      | TSCO    | 30046994  | 191.445            |
> +-------------+--------------+-------------+------------+-------------+---------+-----------+--------------------+
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss,
> damage or destruction of data or any other property which may arise from
> relying on this email's technical content is explicitly disclaimed. The
> author will in no case be liable for any monetary damages arising from such
> loss, damage or destruction.
>
> On 8 October 2016 at 19:05, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> Great, then I think those packages as Spark data sources should allow you to
> do exactly that (replace org.apache.spark.sql.jdbc with an HBase one).
>
> I do think it would be great to get more examples around this, though. It
> would be great if you could share your experience with it!
>
> _____________________________
> From: Benjamin Kim <bbuil...@gmail.com>
> Sent: Saturday, October 8, 2016 11:00 AM
> Subject: Re: Spark SQL Thriftserver with HBase
> To: Felix Cheung <felixcheun...@hotmail.com>
> Cc: <user@spark.apache.org>
>
> Felix,
>
> My goal is to use the Spark SQL JDBC Thriftserver to access HBase tables
> using just SQL. I have been able to CREATE tables using this statement below
> in the past:
>
> CREATE TABLE <table-name>
> USING org.apache.spark.sql.jdbc
> OPTIONS (
>   url "jdbc:postgresql://<hostname>:<port>/dm?user=<username>&password=<password>",
>   dbtable "dim.dimension_acamp"
> );
>
> After doing this, I can access the PostgreSQL table through the Spark SQL
> JDBC Thriftserver using SQL statements (SELECT, UPDATE, INSERT, etc.). I want
> to do the same with HBase tables. We tried this using Hive and HiveServer2,
> but the response times are just too long.
> Thanks,
> Ben
>
> On Oct 8, 2016, at 10:53 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> Ben,
>
> I'm not sure I'm following completely.
>
> Is your goal to use Spark to create or access tables in HBase? If so, the
> link below and several packages out there support that by providing an HBase
> data source for Spark. There are some examples of what the Spark code looks
> like in that link as well. On that note, you should also be able to use the
> HBase data source from a pure SQL (Spark SQL) query, which should work in
> the case of the Spark SQL JDBC Thrift Server (with USING:
> http://spark.apache.org/docs/latest/sql-programming-guide.html#tab_sql_10).
>
> _____________________________
> From: Benjamin Kim <bbuil...@gmail.com>
> Sent: Saturday, October 8, 2016 10:40 AM
> Subject: Re: Spark SQL Thriftserver with HBase
> To: Felix Cheung <felixcheun...@hotmail.com>
> Cc: <user@spark.apache.org>
>
> Felix,
>
> The only alternative way is to create a stored procedure (a UDF, in database
> terms) that would run Spark Scala code underneath. That way, I could use the
> Spark SQL JDBC Thriftserver to execute it with SQL code, passing the
> key/values I want to UPSERT. I wonder if this is possible, since I cannot
> CREATE a wrapper table on top of an HBase table in Spark SQL?
>
> What do you think? Is this the right approach?
>
> Thanks,
> Ben
>
> On Oct 8, 2016, at 10:33 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> HBase has released support for Spark:
> http://hbase.apache.org/book.html#spark
>
> And if you search, you should find several alternative approaches.
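Following Felix's suggestion of swapping `org.apache.spark.sql.jdbc` for an HBase data source, the CREATE TABLE could look roughly like the sketch below. This assumes the shc (Hortonworks Spark-HBase Connector) package; the data source class name and the exact shape of the catalog JSON vary by connector and version, and the table/column names here are only illustrative, taken from the `tsco` example earlier in the thread:

```sql
-- Sketch only: expose an HBase table to the Spark SQL Thriftserver via the
-- shc data source. Catalog format is an assumption based on shc's documented
-- JSON catalog; verify against the connector version actually deployed.
CREATE TABLE tsco
USING org.apache.spark.sql.execution.datasources.hbase
OPTIONS (
  catalog '{
    "table":{"namespace":"default", "name":"tsco"},
    "rowkey":"key",
    "columns":{
      "rowkey":{"cf":"rowkey",      "col":"key",    "type":"string"},
      "close" :{"cf":"stock_daily", "col":"close",  "type":"string"},
      "high"  :{"cf":"stock_daily", "col":"high",   "type":"string"},
      "low"   :{"cf":"stock_daily", "col":"low",    "type":"string"},
      "volume":{"cf":"stock_daily", "col":"volume", "type":"string"}
    }
  }'
);
```

Once registered this way, the table should be queryable over the Thriftserver with plain SELECT statements, mirroring the PostgreSQL example earlier in the thread.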
> On Fri, Oct 7, 2016 at 7:56 AM -0700, "Benjamin Kim" <bbuil...@gmail.com> wrote:
>
> Does anyone know if Spark can work with HBase tables using Spark SQL? I know
> in Hive we are able to create tables on top of an underlying HBase table
> that can be accessed using MapReduce jobs. Can the same be done using
> HiveContext or SQLContext? We are trying to set up a way to GET and POST
> data to and from the HBase table using the Spark SQL JDBC Thriftserver from
> our RESTful API endpoints and/or HTTP web farms. If we can get this to work,
> then we can load-balance the Thriftservers. In addition, this will benefit
> us by giving us a way to abstract the data storage layer away from the
> presentation layer code. There is a chance that we will swap out the data
> storage technology in the future. We are currently experimenting with Kudu.
>
> Thanks,
> Ben
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
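For reference, the Hive-over-HBase approach mentioned in the original question (and found too slow with HiveServer2) is normally set up with Hive's HBaseStorageHandler. A sketch, with table and column names taken from the `tsco` example in this thread rather than from Ben's actual schema:

```sql
-- Sketch only: a Hive external table mapped onto an existing HBase table,
-- using the standard Hive-HBase integration. Names are illustrative.
CREATE EXTERNAL TABLE tsco_hbase (
  key    STRING,   -- HBase row key
  close  STRING,
  high   STRING,
  low    STRING,
  volume STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" =
    ":key,stock_daily:close,stock_daily:high,stock_daily:low,stock_daily:volume"
)
TBLPROPERTIES ("hbase.table.name" = "tsco");
```

A table defined this way is visible to HiveContext/SQLContext as well, since they read the Hive metastore; the latency problem discussed in the thread comes from the execution path, not from the table definition itself.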