Apache Phoenix is super fast for queries that filter data by the table key:
- sub-second latency
- a good JDBC driver
but it has limitations:
- no full outer join support
- inner and left outer joins run in the memory of a single machine, so it cannot join a huge table to another huge table

On Mon, Feb 2, 2015 at 6:59 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:

> I like the Tez engine for Hive (aka the Stinger initiative):
>
> - faster than the MR engine, especially for complex queries with lots of
>   nested sub-queries
> - stable
> - minimum latency is 5-7 sec (0 sec for select count(*) ...)
> - capable of processing huge datasets (not limited by RAM, as Spark is)
>
> On Mon, Feb 2, 2015 at 6:00 PM, Samuel Marks <samuelma...@gmail.com>
> wrote:
>
>> Maybe you're right, and what I should be doing is throwing in connectors
>> so that data from regular databases is pushed into HDFS at regular
>> intervals, wherein my "fancier" analytics can be run across larger
>> data-sets.
>>
>> However, I don't want to decide straightaway; for example, Phoenix +
>> Spark may be just the combination I am looking for.
>>
>> Best,
>>
>> Samuel Marks
>> http://linkedin.com/in/samuelmarks
>>
>> On Mon, Feb 2, 2015 at 5:14 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I think you have to think first about your functional and non-functional
>>> requirements. You can scale "normal" SQL databases as well (cf. CERN or
>>> Facebook). There are different types of databases for different purposes -
>>> there is no one-size-fits-all. At the moment, we are a few years away from a
>>> one-size-fits-all database that leverages AI etc. to automatically scale
>>> and optimize processing, storage and network. Until then you will have to
>>> do the math depending on your requirements.
>>> Once you make them more precise, we will be able to help you more.
>>>
>>> Cheers
>>>
>>> On Feb 2, 2015 at 06:08, "Samuel Marks" <samuelma...@gmail.com> wrote:
>>>
>>> Well, what I am seeking is a Big Data database that can work with Small
>>> Data also, i.e. scalable from one node to vast clusters, whilst
>>> maintaining relatively low latency throughout.
>>>
>>> Which fit into this category?
>>>
>>> Samuel Marks
>>> http://linkedin.com/in/samuelmarks