Stratosphere is a massively parallel data processing system that is
heavily inspired by database technology. It is based on research
published at leading international scientific conferences (VLDB, SIGMOD,
SoCC, CIKM).

It is similar to Spark in many respects: it has a Scala API, supports
complex data flows, and executes iterative programs very efficiently.

A core difference is that it features an optimizer that will, for
example, automatically choose data shipping and execution strategies for
joins (broadcast vs. repartition, sort-merge vs. hybrid-hash join).
Another difference is that its operators are designed to work in memory
but gracefully go out of core under memory pressure.
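
To make the idea of choosing a shipping strategy concrete, here is a toy
sketch in Python. It is not Stratosphere's optimizer or API: the cost model
(bytes sent over the network), the function name choose_join_plan, and the
rule for picking the local strategy are all assumptions made purely for
illustration.

```python
from dataclasses import dataclass


@dataclass
class JoinPlan:
    shipping: str       # "broadcast-left", "broadcast-right" or "repartition"
    local: str          # "sort-merge" or "hybrid-hash"
    network_bytes: int  # estimated bytes shipped under the toy cost model


def choose_join_plan(left_bytes, right_bytes, parallelism, inputs_sorted=False):
    """Pick a join plan with a deliberately simplistic cost model (hypothetical)."""
    # Broadcasting sends a full copy of the smaller input to every parallel task.
    small = min(left_bytes, right_bytes)
    broadcast_cost = small * parallelism

    # Repartitioning ships each input across the network once (upper bound).
    repartition_cost = left_bytes + right_bytes

    if broadcast_cost < repartition_cost:
        side = "left" if left_bytes <= right_bytes else "right"
        shipping, cost = "broadcast-" + side, broadcast_cost
    else:
        shipping, cost = "repartition", repartition_cost

    # Assumed rule: reuse an existing sort order, otherwise build a hash table.
    local = "sort-merge" if inputs_sorted else "hybrid-hash"
    return JoinPlan(shipping, local, cost)


if __name__ == "__main__":
    # A tiny input joined with a large one, then two equally large inputs.
    print(choose_join_plan(10, 1000, parallelism=4))
    print(choose_join_plan(1000, 1000, parallelism=4, inputs_sorted=True))
```

With these made-up numbers, the first call favors broadcasting the small
input and the second favors repartitioning both inputs. A real optimizer
would also weigh disk I/O, CPU cost, and existing data properties.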

Check out the feature overview on the start page of http://stratosphere.eu/

On 23.11.2013 01:17, Ankur Chauhan wrote:
> Hi,
> 
> That's what I thought, but as per the slides on http://www.stratosphere.eu 
> they seem to "know" about Spark, and the Scala API does look similar.
> I found the PACT model interesting. Would like to know if Matei or other core 
> committers have something to weigh in on.
> 
> -- Ankur
> On 22 Nov 2013, at 16:05, Patrick Wendell <[email protected]> wrote:
> 
>> I've never seen that project before, would be interesting to get a
>> comparison. Seems to offer a much lower level API. For instance this
>> is a wordcount program:
>>
>> https://github.com/stratosphere/stratosphere/blob/master/pact/pact-examples/src/main/java/eu/stratosphere/pact/example/wordcount/WordCount.java
>>
>> On Thu, Nov 21, 2013 at 3:15 PM, Ankur Chauhan <[email protected]> 
>> wrote:
>>> Hi,
>>>
>>> I was just curious about https://github.com/stratosphere/stratosphere
>>> and how Spark compares to it. Does anyone have any experience with it
>>> and care to comment?
>>>
>>> -- Ankur
> 
