Hi Siddharth,

If your data fits into memory, then I'd recommend using an RDBMS. They work great when they can meet your scaling requirements.

Thanks,
James

On Thursday, September 11, 2014, Siddharth Ubale <siddharth.ub...@syncoms.com> wrote:

> Hi Anil,
>
> Thanks for the concise reply.
>
> Just wanted to take the conversation further and understand what benefits
> Phoenix would offer in a scenario where we can employ an in-memory system
> like Apache Spark or Impala on top of Hive to reduce latency.
>
> I am asking because security could then be handled better.
>
> Please do share your views.
>
> Thanks,
> Siddharth Ubale
>
> *From:* anil gupta [mailto:anilgupt...@gmail.com]
> *Sent:* Wednesday, September 10, 2014 9:50 PM
> *To:* Prakash Hosalli
> *Cc:* user@phoenix.apache.org
> *Subject:* Re: Hive or Phoenix
>
> Hi Prakash,
>
> Here is the URL for the performance comparison:
> http://phoenix.apache.org/performance.html
>
> Thanks,
> Anil Gupta
>
> On Wed, Sep 10, 2014 at 9:16 AM, anil gupta <anilgupt...@gmail.com> wrote:
>
> Hi Prakash,
>
> Please find my reply inline.
>
> On Tue, Sep 9, 2014 at 11:28 PM, Prakash Hosalli <prakash.hosa...@syncoms.com> wrote:
>
> Hi James/Anil,
>
> Regarding the questions you put forward,
>
> 1. Yes, we will store the data in HBase.
> 2. Hive will run over HBase.
>
> Anil: I do not know your use case, so I cannot say how much you can do with
> the OOTB (out of the box) features of the Hive and HBase integration. But when I
> tried to use Hive with HBase, I could not, because Hive does not
> support querying a table that has composite rowkeys. In a production
> environment, users have composite rowkeys most of the time. Obviously, you
> can patch the Hive-HBase integration to make it better. Please keep in mind
> that Hive is not designed to support HBase (HBase integration is just a
> small feature of Hive).
> In contrast, Phoenix is designed "on top of HBase",
> so you will get much better integration and optimization of HBase
> queries.
>
> 3. We will be using a large amount of data (approximately 10 million
> rows to be processed daily).
>
> Anil: What kind of processing will you be doing? If you are doing simple
> aggregates, that is already supported by Phoenix. You can also have a look
> at the Phoenix-Pig integration to leverage the analytical power of Pig (although
> Pig is a data flow language and Hive is declarative, you get the Pig
> integration OOTB).
>
> 4. Right now we have both options open, but primarily we plan to use
> a Hive table to serve client requests/queries on aggregated data.
>
> Anil: People primarily use Hive for SQL querying; the same can be achieved
> in a better way with Phoenix (especially when HBase is your storage).
>
> 5. We plan to run all types of queries, and we aim for very
> low latency.
>
> Anil: Phoenix will provide you much better performance on HBase.
>
> If I understand correctly, Phoenix will just connect to HBase
> securely and rely on the HBase API to answer queries; therefore Phoenix
> will depend on the security mechanisms employed by the HBase API and will not
> provide any security features by itself.
>
> Anil: Yes, that is true. At present, Phoenix does not provide a mechanism
> to grant/revoke/create/add users. The same can be done using the HBase shell, and
> Phoenix will honor those changes. Phoenix is open source, so a patch is
> always appreciated for new features.
>
> Kindly correct me if my understanding is wrong.
>
> Thanks & Regards,
> Prakash Hosalli
>
> -----Original Message-----
> From: James Taylor [mailto:jamestay...@apache.org]
> Sent: Tuesday, September 09, 2014 11:56 PM
> To: user; anil gupta
> Subject: Re: Hive or Phoenix
>
> Hi Prakash,
> If possible, it'd be helpful if you could describe your use case a bit.
>
> Some questions I'd have for you: is the data over which you'd query stored
> in HBase? And if so, would Hive run over the HBase data? Is the data
> read-only or does it mutate? How much data are we talking about
> (approximately) and what would your typical queries be: point look-ups,
> range scans, or full table scans?
>
> As far as security goes, HBase provides some more fine-grained mechanisms as
> well, which you could leverage through the HBase APIs. Other than the ability to
> connect to a secure cluster through the connection URL, Phoenix doesn't yet
> provide a SQL wrapper on these HBase APIs. This is how Intuit is leveraging
> Phoenix + security in HBase. Anil Gupta can likely tell you more.
>
> Thanks,
> James
>
> On Tue, Sep 9, 2014 at 9:28 AM, Nicolas Maillard <nmaill...@hortonworks.com> wrote:
> > Hello Prakash
> >
> > Framing this as Hive or Phoenix is a little misleading; they do serve
> > different needs. Let me break it down as best I can.
> >
> > You mention security:
> > Phoenix and Hive both work on a secured Hadoop cluster, but Hive with
> > Hive Atz has a more fine-grained authorization model. So from that
> > perspective Hive has more features.
> >
> > Query performance
> > On the performance side, Phoenix has random read/write access, whereas
> > Hive is full data access, so there is no way to read a particular entry
> > unless you read the whole associated file.
> > So Hive is batch or interactive, meaning a couple of tens of seconds
> > to get your answer, whereas Phoenix can be sub-second. The response time
> > will depend greatly on whether part of the Phoenix key is in your
> > query. If you do a full table scan, response time will suffer. Granted,
> > secondary indexes could help you there.
> >
> > SQL Semantics
> > Hive currently has richer SQL semantics, with analytics functions,
> > complex types etc...
> > Phoenix is also more limited than Hive in joins or UDFs.
> >
> > So I would use Hive for large data, random analysis and ETL, and pay
> > the price of the response time a little.
> > Phoenix, on the other hand, is great for large volumes of data where you
> > can set up your schema, and especially your keys, according to specific needs
> > and query patterns; in this situation you would get great query performance.
> >
> > To sum up, in all honesty, both are needed.
> >
> > Hope this helps
> >
> > On Tue, Sep 9, 2014 at 4:19 PM, Prakash Hosalli
> > <prakash.hosa...@syncoms.com> wrote:
> >>
> >> Hi,
> >>
> >> Does Phoenix have any security layer in it, as we have in
> >> Hive?
> >>
> >> I am getting confused about whether to go forward with Phoenix or Hive in
> >> the production environment in my company.
> >>
> >> Thanks & Regards,
> >>
> >> Prakash Hosalli
> >>
> >> Syncoms Bangalore India.
>
> --
> Thanks & Regards,
> Anil Gupta
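
A note on the point made in the thread that response time depends greatly on whether part of the Phoenix key is in the query: the sketch below is a toy model of that idea, not Phoenix or HBase code. It assumes a hypothetical composite row key of the form customer|day, kept in sorted order the way HBase keeps row keys sorted, and counts how many rows each access path has to examine. The names rows, keys, prefix_scan and full_scan are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical table: 100 customers x 30 days, keyed by "customer|day".
# Rows are kept sorted by key, mimicking HBase's sorted row keys.
rows = sorted(
    (f"cust{c:03d}|2014-09-{d:02d}", c * d)
    for c in range(100)
    for d in range(1, 31)
)
keys = [k for k, _ in rows]


def prefix_scan(prefix):
    """Range scan: seek to the first key >= prefix, stop when the prefix ends."""
    i = bisect_left(keys, prefix)
    hits = []
    while i < len(rows) and keys[i].startswith(prefix):
        hits.append(rows[i])
        i += 1
    # Only the matching rows were examined.
    return hits, len(hits)


def full_scan(day):
    """Filter on the trailing key part only: every row has to be examined."""
    hits = [r for r in rows if r[0].endswith("|" + day)]
    return hits, len(rows)


if __name__ == "__main__":
    by_customer, examined = prefix_scan("cust007|")
    print(len(by_customer), "rows returned,", examined, "rows examined")
    by_day, examined = full_scan("2014-09-10")
    print(len(by_day), "rows returned,", examined, "rows examined")
```

In this toy setup the query on the leading key part examines only the rows it returns, while the query on the trailing part walks the whole table. That is the situation where, as noted above, a secondary index could help.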