Thanks Devopam,

In my initial post I did mention Presto, with this review:

"Can query Hive, Cassandra <http://cassandra.apache.org/>, relational DBs, etc. Doesn't seem to be designed for low-latency responses across small clusters, or to support UPDATE operations. It is optimized for data warehousing or analytics¹ <http://prestodb.io/docs/current/overview/use-cases.html>"

Your thoughts?

Best,

Samuel Marks
http://linkedin.com/in/samuelmarks

On 03/02/2015 6:06 pm, "Devopam Mittra" <devo...@gmail.com> wrote:

> hi Samuel,
> You may wish to evaluate Presto (https://prestodb.io/), which has the
> added advantage of being faster than conventional Hive because no MR jobs
> are fired.
> It has a dependency on the Hive metastore, though, through which it derives
> the mechanism to execute the queries directly on source files.
> The only flip side I found was the absence of complex SQL syntax, which
> means creating a lot of intermediate tables for even slightly complicated
> calculations (and imho, all calculations become complex sooner than we
> intend them to).
>
> regards
> Devopam
>
> On Tue, Feb 3, 2015 at 10:30 AM, Samuel Marks <samuelma...@gmail.com>
> wrote:
>
>> Alexander: So would you recommend using Phoenix for all but those kinds of
>> queries, and switching to Hive+Tez for the rest? - Is that feasible?
>>
>> Checking their documentation, it looks like it just might be:
>> https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
>>
>> There is some early work on a Hive + Phoenix integration on GitHub:
>> https://github.com/nmaillard/Phoenix-Hive
>>
>> Saurabh: I am sure there are a variety of very good non-open-source
>> products on the market :) - However, in this thread I am only looking at
>> open-source options. Additionally, I am planning on open-sourcing the
>> project I am building using these tools, so it makes even more sense that
>> the entire toolset and its dependencies are also open-source.
>>
>> Best,
>>
>> Samuel Marks
>> http://linkedin.com/in/samuelmarks
>>
>> On Tue, Feb 3, 2015 at 2:33 PM, Saurabh B <saurabh.wri...@gmail.com>
>> wrote:
>>
>>> This is not open source but we are using Vertica and it works very
>>> nicely for us. There is a 1TB community edition but above that it costs
>>> money.
>>> It has really advanced SQL (analytical functions, etc), works like an
>>> RDBMS, has R/Java/C++ SDKs and scales nicely. There is a similar option,
>>> Redshift, available but Vertica has more features (pattern matching
>>> functions, etc).
>>>
>>> Again, not open source, so I would be interested to know what you end up
>>> going with and what your experience is.
>>>
>>> On Mon, Feb 2, 2015 at 12:08 AM, Samuel Marks <samuelma...@gmail.com>
>>> wrote:
>>>
>>>> Well, what I am seeking is a Big Data database that can work with Small
>>>> Data also, i.e. scalable from one node to vast clusters whilst
>>>> maintaining relatively low latency throughout.
>>>>
>>>> Which fit into this category?
>>>>
>>>> Samuel Marks
>>>> http://linkedin.com/in/samuelmarks
>>>>
>>>
>>
>
> --
> Devopam Mittra
> Life and Relations are not binary