Hi Samuel,
You may wish to evaluate Presto (https://prestodb.io/), which has the added
advantage of being faster than conventional Hive because it does not launch
MapReduce jobs.
It does depend on the Hive metastore, though, which it uses to obtain the
table metadata it needs to run queries directly on the source files.
The only downside I found was the lack of more complex SQL syntax, which
means creating a lot of intermediate tables for even slightly complicated
calculations (and IMHO, all calculations become complex sooner than we
intend them to).

Regards,
Devopam

On Tue, Feb 3, 2015 at 10:30 AM, Samuel Marks <samuelma...@gmail.com> wrote:

> Alexander: So would you recommend using Phoenix for all but those kinds of
> queries, and switching to Hive+Tez for the rest? - Is that feasible?
>
> Checking their documentation, it looks like it just might be:
> https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
>
> There is some early work on a Hive + Phoenix integration on GitHub:
> https://github.com/nmaillard/Phoenix-Hive
>
> Saurabh: I am sure there are a variety of very good non open-source
> products on the market :) - However in this thread I am only looking at
> open-source options. Additionally I am planning on open-sourcing this
> project I am building using these tools, so it makes even more sense that
> the entire toolset and their dependencies are also open-source.
>
> Best,
>
> Samuel Marks
> http://linkedin.com/in/samuelmarks
>
> On Tue, Feb 3, 2015 at 2:33 PM, Saurabh B <saurabh.wri...@gmail.com>
> wrote:
>
>> This is not open source but we are using Vertica and it works very nicely
>> for us. There is a 1TB community edition but above that it costs money.
>> It has really advanced SQL (analytical functions, etc), works like an
>> RDBMS, has R/Java/C++ SDK and scales nicely. There is a similar option of
>> Redshift available but Vertica has more features (pattern matching
>> functions, etc).
>>
>> Again, not open source so I would be interested to know what you end up
>> going with and what your experience is.
>>
>> On Mon, Feb 2, 2015 at 12:08 AM, Samuel Marks <samuelma...@gmail.com>
>> wrote:
>>
>>> Well, what I am seeking is a Big Data database that can work with Small
>>> Data also, i.e., scalable from one node to vast clusters, whilst
>>> maintaining relatively low latency throughout.
>>>
>>> Which fit into this category?
>>>
>>> Samuel Marks
>>> http://linkedin.com/in/samuelmarks
>>>
>>
>>
>


-- 
Devopam Mittra
Life and Relations are not binary
