Re: Any possibility to run larger data sets with DirectRunner?

Lukasz Cwik Wed, 25 Sep 2019 12:46:27 -0700

+1 for local execution using Flink.
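
For anyone finding this thread later, here is a rough sketch of what a local Flink run might look like from the Beam Python SDK. It is only an illustration under some assumptions: a Beam release whose Python SDK ships FlinkRunner, a Java runtime on the machine, and the helper names flink_local_args and run, which are made up for this example. The toy line-length transform just stands in for a real workflow. Whether this copes with your data volume is something you would need to measure.

```python
def flink_local_args(parallelism=2):
    """Build pipeline arguments for an embedded, single-machine Flink run.

    Hypothetical helper for illustration; the flag values are assumptions
    to adapt to your own setup.
    """
    return [
        "--runner=FlinkRunner",
        # "[local]" asks Beam to start an embedded Flink instead of
        # connecting to an existing cluster.
        "--flink_master=[local]",
        f"--parallelism={parallelism}",
        # Run the Python workers inside the submitting process.
        "--environment_type=LOOPBACK",
    ]


def run(input_path, output_prefix, parallelism=2):
    """Read text lines, compute each line's length, write the results."""
    # Imported here so the argument helper above works without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(flink_local_args(parallelism))
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read" >> beam.io.ReadFromText(input_path)
            | "LineLength" >> beam.Map(lambda line: str(len(line)))
            | "Write" >> beam.io.WriteToText(output_prefix)
        )
```

Calling run("input.txt", "out/lengths") would then execute the pipeline on the embedded Flink; the two paths are placeholders.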

On Tue, Sep 17, 2019 at 4:24 AM Paweł Kordek <[email protected]> wrote:

> Hi Steve
>
> Maybe local execution on a Flink cluster will work for you:
> https://beam.apache.org/documentation/runners/flink/ ?
>
> Cheers
> Pawel
>
> On Tue, 17 Sep 2019 at 10:51, Steve973 <[email protected]> wrote:
>
>> Hi, all. I would like to begin setting up my workflow in Apache Beam, but
>> only run it on a local machine until our system administrators have the
>> capacity to set up an adequate (Spark or Hadoop) cluster. From the
>> documentation, I understand that we should be mindful of the memory
>> requirements of the data set that we use, but is there any way (at the
>> sacrifice of speed, of course) to use a larger data set with the
>> DirectRunner? Can we configure it to spill to disk, possibly?
>>
>> Thanks,
>> Steve
>>
