Re: Which [open-source] SQL engine atop Hadoop?

Samuel Marks Tue, 03 Feb 2015 00:30:57 -0800

Thanks Devopam,

In my initial post I did mention Presto, with this review:
"can query Hive, Cassandra <http://cassandra.apache.org/>, relational DBs,
etc. Doesn't seem to be designed for low-latency responses across small
clusters, or support UPDATE operations. It is optimized for data
warehousing or analytics¹
<http://prestodb.io/docs/current/overview/use-cases.html>"


Your thoughts?

Best,

Samuel Marks
http://linkedin.com/in/samuelmarks
On 03/02/2015 6:06 pm, "Devopam Mittra" <devo...@gmail.com> wrote:

> Hi Samuel,
> You may wish to evaluate Presto (https://prestodb.io/), which has the
> added advantage of being faster than conventional Hive because no MR jobs
> are fired.
> It has a dependency on the Hive metastore, though, through which it derives
> the mechanism to execute the queries directly on source files.
> The only downside I found was the absence of complex SQL syntax, which
> means creating a lot of intermediate tables for even slightly complicated
> calculations (and imho, all calculations become complex sooner than we
> intend them to).
>
> regards
> Devopam
>
> On Tue, Feb 3, 2015 at 10:30 AM, Samuel Marks <samuelma...@gmail.com>
> wrote:
>
>> Alexander: So would you recommend using Phoenix for all but those kinds of
>> queries, and switching to Hive+Tez for the rest? Is that feasible?
>>
>> Checking their documentation, it looks like it just might be:
>> https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
>>
>> There is some early work on a Hive + Phoenix integration on GitHub:
>> https://github.com/nmaillard/Phoenix-Hive
>>
>> Saurabh: I am sure there are a variety of very good non-open-source
>> products on the market :) However, in this thread I am only looking at
>> open-source options. Additionally, I am planning on open-sourcing the
>> project I am building with these tools, so it makes even more sense that
>> the entire toolset and its dependencies are also open-source.
>>
>> Best,
>>
>> Samuel Marks
>> http://linkedin.com/in/samuelmarks
>>
>> On Tue, Feb 3, 2015 at 2:33 PM, Saurabh B <saurabh.wri...@gmail.com>
>> wrote:
>>
>>> This is not open source but we are using Vertica and it works very
>>> nicely for us. There is a 1TB community edition but above that it costs
>>> money.
>>> It has really advanced SQL (analytical functions, etc.), works like an
>>> RDBMS, has R/Java/C++ SDKs, and scales nicely. Redshift is a similar
>>> option, but Vertica has more features (pattern matching functions,
>>> etc.).
>>>
>>> Again, not open source so I would be interested to know what you end up
>>> going with and what your experience is.
>>>
>>> On Mon, Feb 2, 2015 at 12:08 AM, Samuel Marks <samuelma...@gmail.com>
>>> wrote:
>>>
>>>> Well, what I am seeking is a Big Data database that can work with Small
>>>> Data also, i.e., scalable from one node to vast clusters, whilst
>>>> maintaining relatively low latency throughout.
>>>>
>>>> Which fit into this category?
>>>>
>>>> Samuel Marks
>>>> http://linkedin.com/in/samuelmarks
>>>>
>>>
>>>
>>
>
>
> --
> Devopam Mittra
> Life and Relations are not binary
>
