Hi Prakash,

Here is the URL for the performance comparison:
http://phoenix.apache.org/performance.html
Thanks,
Anil Gupta

On Wed, Sep 10, 2014 at 9:16 AM, anil gupta <anilgupt...@gmail.com> wrote:

> Hi Prakash,
>
> Please find my reply inline.
>
> On Tue, Sep 9, 2014 at 11:28 PM, Prakash Hosalli
> <prakash.hosa...@syncoms.com> wrote:
>
>> Hi James/Anil,
>>
>> Regarding the questions you put forward:
>>
>> 1. Yes, we will store the data in HBase.
>> 2. Hive will run over HBase.
>
> Anil: I do not know enough about your use case to say how much you can
> do with the OOTB (out of the box) features of the Hive-HBase
> integration. But when I tried to use Hive with HBase, I could not,
> because Hive does not support querying a table that has composite
> rowkeys. In a production environment, users have composite rowkeys
> most of the time. Obviously, you can patch the Hive-HBase integration
> to make it better. Please keep in mind that Hive is not designed to
> support HBase (HBase integration is just a small feature of Hive). In
> contrast, Phoenix is designed "on top of HBase", so you will get much
> better integration and optimization of HBase queries.
>
>> 3. We will be using a large amount of data (approximately 10 million
>> rows to be processed daily).
>
> Anil: What kind of processing will you be doing? If you are doing
> simple aggregates, that is already supported by Phoenix. You can also
> have a look at the Phoenix-Pig integration to leverage the additional
> analytical power of Pig (although Pig is a data flow language and Hive
> is declarative, you get the Pig integration OOTB).
>
>> 4. Right now we have both options open, but primarily we plan to use
>> Hive tables to serve client requests/queries on aggregated data.
>
> Anil: People primarily use Hive for SQL querying; the same can be
> achieved in a better way with Phoenix (especially when HBase is your
> storage).
>
>> 5. We plan to employ all types of queries, and we aim to achieve low
>> latency.
>
> Anil: Phoenix will provide you much better performance on HBase.
>
>> If I understand correctly, Phoenix will just connect to HBase
>> securely and rely on the HBase API to answer queries; therefore
>> Phoenix will depend on the security mechanisms employed by the HBase
>> API and will not provide any security feature by itself.
>
> Anil: Yes, that is true. At present, Phoenix does not provide a
> mechanism to grant/revoke/create/add users. The same can be done using
> the HBase shell, and Phoenix will honor those changes. Phoenix is open
> source, so a patch is always appreciated for new features.
>
>> Kindly correct me if my understanding is wrong.
>>
>> Thanks & Regards,
>> Prakash Hosalli
>>
>> -----Original Message-----
>> From: James Taylor [mailto:jamestay...@apache.org]
>> Sent: Tuesday, September 09, 2014 11:56 PM
>> To: user; anil gupta
>> Subject: Re: Hive or Phoenix
>>
>> Hi Prakash,
>> If possible, it'd be helpful if you could describe your use case a
>> bit.
>>
>> Some questions I'd have for you: is the data over which you'd query
>> stored in HBase? And if so, would Hive run over the HBase data? Is
>> the data read-only or does it mutate? How much data are we talking
>> about (approximately), and what would your typical queries be: point
>> look-ups, range scans, or full table scans?
>>
>> As far as security goes, HBase also provides some more fine-grained
>> mechanisms which you could leverage through HBase APIs. Other than
>> the ability to connect to a secure cluster through the connection
>> URL, Phoenix doesn't yet provide a SQL wrapper on these HBase APIs.
>> This is how Intuit is leveraging Phoenix + security in HBase. Anil
>> Gupta can likely tell you more.
>>
>> Thanks,
>> James
>>
>> On Tue, Sep 9, 2014 at 9:28 AM, Nicolas Maillard
>> <nmaill...@hortonworks.com> wrote:
>> > Hello Prakash
>> >
>> > Considering Hive or Phoenix is a little misleading; they do serve
>> > different needs. Let me break it down as best I can.
>> >
>> > You mention security:
>> > Phoenix and Hive both work on a secured Hadoop cluster, but Hive
>> > with Hive Atz has a more fine-grained authorization model. So from
>> > that perspective Hive has more features.
>> >
>> > Query performance
>> > On the performance side, Phoenix has random read/write access,
>> > whereas Hive is full data access, so there is no way to read a
>> > particular entry unless you read the whole associated file.
>> > So Hive is batch or interactive, meaning a couple of tens of
>> > seconds to get your answer, whereas Phoenix can be sub-second. The
>> > response time will depend greatly on whether part of the Phoenix
>> > key is in your query. If you do a full table scan, response time
>> > will suffer. Granted, secondary indexes could help you there.
>> >
>> > SQL Semantics
>> > Hive currently has richer SQL semantics, with analytic functions,
>> > complex types, etc.
>> > Phoenix is also more limited than Hive in joins and UDFs.
>> >
>> > So I would use Hive for large data, random analysis and ETL, and
>> > pay the price of the response time a little.
>> > Phoenix, on the other hand, is great for large volumes of data
>> > where you can set up your schema, and especially keys, according to
>> > specific needs and query patterns; in this situation you would get
>> > great query performance.
>> >
>> > To sum up, in all honesty, both are needed.
>> >
>> > Hope this helps
>> >
>> > On Tue, Sep 9, 2014 at 4:19 PM, Prakash Hosalli
>> > <prakash.hosa...@syncoms.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> Does Phoenix have any security layer in it, as we have in Hive?
>> >>
>> >> I am getting confused about whether to go forward with Phoenix or
>> >> Hive in the production environment at my company.
>> >>
>> >> Thanks & Regards,
>> >> Prakash Hosalli
>> >> Syncoms Bangalore India.
>
> --
> Thanks & Regards,
> Anil Gupta

--
Thanks & Regards,
Anil Gupta
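
Nicolas's remark in the thread, that response time depends greatly on whether part of the key is in the query, can be illustrated with a toy model. The sketch below is not Phoenix or HBase code. It only assumes that rows are stored sorted by a composite row key; the (customer, day) key layout, the sample data, and the function names are hypothetical, made up for illustration.

```python
from bisect import bisect_left

# Toy stand-in for a table whose rows are kept sorted by a composite row key.
# The (customer, day) key layout and the data are hypothetical.
ROWS = sorted(
    ((customer, day), amount)
    for customer, day, amount in [
        ("c1", "2014-09-08", 10),
        ("c1", "2014-09-09", 20),
        ("c2", "2014-09-08", 5),
        ("c2", "2014-09-09", 7),
        ("c3", "2014-09-09", 1),
    ]
)
KEYS = [key for key, _ in ROWS]


def sum_by_key_prefix(customer):
    """Range scan: seek to the first key with this leading column, stop when it changes."""
    start = bisect_left(KEYS, (customer,))
    total, touched = 0, 0
    for key, amount in ROWS[start:]:
        if key[0] != customer:
            break
        total += amount
        touched += 1
    return total, touched


def sum_by_day(day):
    """Full scan: the day is not the leading key column, so every row is examined."""
    total, touched = 0, 0
    for key, amount in ROWS:
        touched += 1
        if key[1] == day:
            total += amount
    return total, touched


if __name__ == "__main__":
    print(sum_by_key_prefix("c2"))   # (sum, rows touched) for a key-prefix query
    print(sum_by_day("2014-09-09"))  # (sum, rows touched) for a non-key filter
```

Each function returns the aggregate together with the number of rows it had to examine. Filtering on the leading key column touches only the matching rows, while filtering on a non-leading column touches the whole table, which is the case where a secondary index could help, as mentioned in the thread.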