[ 
https://issues.apache.org/jira/browse/BEAM-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977150#comment-15977150
 ] 

Mitar commented on BEAM-2026:
-----------------------------

I have not yet run any benchmarks, but I would suspect that having any extra 
layer in between would make it slower, no?

To me, one issue is that Spark adds the whole JVM into the mix. But I see that 
the current implementation of the Beam direct runner is also based on the JVM.

For me personally, it is more about how hard it is to start using any of these 
distributed technologies. The appeal of Beam to me is that I can for now learn 
the programming model and start developing in it, and then later on, if needed, 
I can scale it by changing the execution runner, and at that point also learn 
all the details of how to deploy Spark or Flink and so on. This probably does 
not matter to somebody who already knows how to run and use Spark or Flink. 
But not everyone does.

In a way, I would just prefer to start by programming in Python, in the Beam 
programming model, using a Python runner, and then scale it if needed.

> High performance direct runner
> ------------------------------
>
>                 Key: BEAM-2026
>                 URL: https://issues.apache.org/jira/browse/BEAM-2026
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-direct
>            Reporter: Mitar
>            Assignee: Thomas Groh
>
> In the documentation (https://beam.apache.org/documentation/runners/direct/) 
> it is written that the direct runner does not try to run efficiently, but 
> serves mostly for development and debugging.
> I would suggest that there should also be an efficient direct runner. If Beam 
> tries to be a unified programming model, for some smaller tasks I would love 
> to implement them in Beam, just to keep the code in the same model, but it 
> would be OK to run it as a normal smaller program (maybe inside one Docker 
> container), without any distribution across multiple machines. In the future, 
> if usage grows, I could then replace the underlying runner with something 
> distributed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
