See http://phoenix.apache.org/ and the Features menu items.
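To give a flavour of the multi tenancy feature asked about below: a
Phoenix table declared with MULTI_TENANT=true is transparently
partitioned by its leading primary key column, and a connection opened
with the TenantId property set sees only its own rows. A minimal sketch;
the table, column, and tenant names here are made up for illustration:

    -- Base table: the first PK column carries the tenant id.
    CREATE TABLE trades (
        tenant_id VARCHAR NOT NULL,
        trade_id  VARCHAR NOT NULL,
        ticker    VARCHAR,
        price     DECIMAL,
        CONSTRAINT pk PRIMARY KEY (tenant_id, trade_id)
    ) MULTI_TENANT=true;

    -- A connection opened with the TenantId property set to 'acme'
    -- (e.g. props.setProperty("TenantId", "acme") before
    -- DriverManager.getConnection) sees only rows where
    -- tenant_id = 'acme', without ever referencing that column.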
On Sunday, October 23, 2016, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Sorry, I forgot: you were referring to "multi tenancy"?
>
> Can you please elaborate on this?
>
> Thanks
>
> Dr Mich Talebzadeh
>
> On 23 October 2016 at 22:53, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Thanks James.
>>
>> My use case is trade data into HDFS, which is then put into HBase via
>> Phoenix. This is the batch layer.
>>
>> 1. Every row has a UUID as row key and is immutable (append only).
>> 2. Source: trade data -> Kafka -> Flume -> HDFS. The HDFS directories
>>    are partitioned by DtStamp (daily).
>> 3. cron from HDFS -> Phoenix -> HBase
>> 4. cron from HDFS -> Hive ORC tables with partitions
>>
>> For batch data visualisation we have a choice of using:
>>
>> 1. Phoenix JDBC through Zeppelin (limited, as Phoenix does not have
>>    analytic functions, though the same can be achieved with the usual
>>    joins).
>> 2. Hive JDBC through Zeppelin, with analytics support. The best choice
>>    for SQL; pretty fast with Hive on Spark as the execution engine.
>> 3. Spark SQL with functional programming directly on HBase.
>> 4. Spark SQL with Hive.
>> 5. Spark SQL does not work on Phoenix (Spark 2 JDBC to Phoenix is
>>    broken; I believe a JIRA is open with HBase on this).
>>
>> So we have a resilient design here. Phoenix secondary indexes are also
>> very useful.
>>
>> BTW, after every new append can we run update statistics on Phoenix
>> tables and indexes, as we do with Hive?
>>
>> Dr Mich Talebzadeh
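On the update statistics question just above: Phoenix does have an UPDATE
STATISTICS command that can be run after each batch load. Its statistics
are guideposts used to split scans evenly across threads, rather than
join-order statistics as in Hive. A minimal sketch, borrowing the table
name used later in this thread:

    -- Recollect guideposts for the table and all of its indexes:
    UPDATE STATISTICS "marketDataHbase" ALL;

    -- Or restrict collection to the indexes only:
    UPDATE STATISTICS "marketDataHbase" INDEX;

Refreshing guideposts after a large append keeps subsequent scans evenly
parallelised.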
>> On 23 October 2016 at 22:29, James Taylor <jamestay...@apache.org> wrote:
>>
>>> Keep in mind that the CsvBulkLoadTool does not handle updating data
>>> in-place. It expects the data to be unique by row, not updates of
>>> existing rows. If your data is write-once/append-only you will be fine,
>>> but otherwise you should stick with the JDBC APIs.
>>>
>>> You're free to just use the HBase APIs (maybe that's better for your
>>> use case?), but you won't get:
>>> - JDBC APIs
>>> - SQL
>>> - a relational data model
>>> - parallel execution of your queries
>>> - secondary indexes
>>> - cross-row/cross-table transactions
>>> - query optimization
>>> - views
>>> - multi tenancy
>>> - the query server
>>>
>>> HBase doesn't store data either; it relies on HDFS to do that. And HDFS
>>> eventually stores data in a file system, relying on the OS.
>>>
>>> Thanks,
>>> James
>>>
>>> On Sun, Oct 23, 2016 at 2:09 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> Thanks Sergey,
>>>>
>>>> I have modified the design to load data into HBase through the Phoenix
>>>> table. That way both the table in HBase and the index in HBase are
>>>> maintained. I assume the Phoenix bulk load (CsvBulkLoadTool) updates
>>>> the underlying table in HBase plus all the indexes there as well.
>>>>
>>>> I also noticed some ambiguity here
>>>> <https://en.wikipedia.org/wiki/Apache_Phoenix>:
>>>>
>>>> "Apache Phoenix is an open source, massively parallel, relational
>>>> database engine supporting OLTP for Hadoop using Apache HBase as its
>>>> backing store."
>>>>
>>>> It is not a database. The underlying data store is HBase. All Phoenix
>>>> does is let one use SQL on top of HBase to manipulate HBase tables
>>>> with DDL and DQ (data query). It does not store data itself.
>>>>
>>>> I trust this is the correct assessment.
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> On 23 October 2016 at 21:49, Sergey Soldatov <sergeysolda...@gmail.com> wrote:
>>>>
>>>>> Hi Mich,
>>>>> No, if you update HBase directly, the index will not be maintained.
>>>>> Actually I would suggest ingesting the data using the Phoenix CSV
>>>>> bulk load.
>>>>>
>>>>> Thanks,
>>>>> Sergey
>>>>>
>>>>> On Sat, Oct 22, 2016 at 12:49 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Sergey,
>>>>>>
>>>>>> In this case the Phoenix view is defined on the HBase table.
>>>>>>
>>>>>> The HBase table is updated every 15 minutes via a cron job that uses
>>>>>> org.apache.hadoop.hbase.mapreduce.ImportTsv to bulk load data into
>>>>>> the HBase table.
>>>>>>
>>>>>> So if I create an index on my view in Phoenix, will that index be
>>>>>> maintained?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Dr Mich Talebzadeh
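The distinction Sergey draws (direct HBase writes versus loading through
Phoenix) is visible in the two load paths. Roughly, as a sketch: the jar
name, column family "cf", paths, and exact table names below are
placeholders, not taken from the thread:

    # ImportTsv writes straight into the HBase table; Phoenix knows
    # nothing about the load, so any Phoenix index is NOT maintained:
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.separator=',' \
      -Dimporttsv.columns=HBASE_ROW_KEY,cf:ticker,cf:timecreated,cf:price \
      marketDataHbase /data/prices/2016-10-23

    # The Phoenix CsvBulkLoadTool goes through the Phoenix metadata and
    # writes HFiles for the table AND its indexes:
    hadoop jar phoenix-<version>-client.jar \
      org.apache.phoenix.mapreduce.CsvBulkLoadTool \
      --table MARKETDATAHBASE \
      --input /data/prices/2016-10-23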
>>>>>> On 21 October 2016 at 23:35, Sergey Soldatov <sergeysolda...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Mich,
>>>>>>>
>>>>>>> It really depends on the query that you are going to use. If
>>>>>>> conditions will be applied only to the time column, you may create
>>>>>>> an index like:
>>>>>>>
>>>>>>>     create index I on "marketDataHbase" ("timecreated")
>>>>>>>         include ("ticker", "price");
>>>>>>>
>>>>>>> If the conditions will be applied to the other columns as well, you
>>>>>>> may use:
>>>>>>>
>>>>>>>     create index I on "marketDataHbase" ("timecreated", "ticker",
>>>>>>>         "price");
>>>>>>>
>>>>>>> The index is updated together with the user table if you are using
>>>>>>> the Phoenix JDBC driver or the Phoenix bulk load tools to ingest
>>>>>>> the data.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sergey
>>>>>>>
>>>>>>> On Fri, Oct 21, 2016 at 4:43 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have a Phoenix table on HBase as follows:
>>>>>>>>
>>>>>>>> [image: Inline images 1]
>>>>>>>>
>>>>>>>> I want to create a covered index to cover the three columns:
>>>>>>>> ticker, timecreated, price.
>>>>>>>>
>>>>>>>> More importantly, I want the index to be maintained when new rows
>>>>>>>> are added to the HBase table.
>>>>>>>>
>>>>>>>> What is the best way of achieving this?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Dr Mich Talebzadeh
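Putting the thread together, the overall setup is a Phoenix view over an
existing HBase table with a covered index on it. A sketch under assumed
names: the real schema was in the inline image above, so the column
family "cf", the VARCHAR types, and the row-key column "pk" here are
guesses for illustration only:

    -- Map the existing HBase table as a Phoenix view:
    CREATE VIEW "marketDataHbase" (
        pk VARCHAR PRIMARY KEY,
        "cf"."ticker"      VARCHAR,
        "cf"."timecreated" VARCHAR,
        "cf"."price"       VARCHAR
    );

    -- Covered index: filter on timecreated, with ticker and price
    -- carried in the index so the base table is never touched:
    CREATE INDEX md_idx ON "marketDataHbase" ("timecreated")
        INCLUDE ("ticker", "price");

    -- Check that the plan actually uses the index:
    EXPLAIN SELECT "ticker", "price"
    FROM "marketDataHbase"
    WHERE "timecreated" > '2016-10-01';

As Sergey notes above, the index stays current only for writes that go
through Phoenix (the JDBC driver or the Phoenix bulk load tools); rows
inserted via ImportTsv bypass it.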