Cool, will bear that in mind. It's really only a handful of small scripts
that use certain R packages for various reasons.

Cheers,
Andy

On Thu, 13 Jul 2017, 17:30 Maxime Beauchemin, <[email protected]>
wrote:

> Operators as an abstraction for something like R tend to be more
> restrictive than useful. Similarly it's hard to write a useful
> SparkOperator because it will typically simply fetch an artifact and fire
> it up, and people have different ways of storing artifacts so there's not
> much to generalize.
>
> Though I could see that if there is a set of common patterns you use R for
> and want to parameterize and abstract out or "industrialize", then specific
> operators can be useful. "FetchFromS3andRankROperator" or something like
> that makes more sense than a generic ROperator(script), which would be a
> very thin wrapper around BashOperator.
>
> These specific operators are usually specific to your environment and can
> be defined and reused within your DAG repository.
>
> I don't want to start a flame war here but there's a bigger question on
> whether you want to allow running R in production. It's dangerous for many
> reasons that I won't get into here unless we decide to have this
> conversation. Regardless, we do use R in production at Airbnb and would
> recommend using the cgroup features in Airflow and having a dedicated queue
> of workers to insulate against abuse and contain resource utilisation. I'd
> also recommend publishing a set of internal rules ("When is it OK to use R
> in production") and having engineers do some gatekeeping in source control.
>
> You also may want to consider SparkR as a path to productionize R, though
> in my experience data scientists tend to find it too restrictive as it
> doesn't have the bells, whistles and trumpets the desktop R has.
>
> Max
>
> On Thu, Jul 13, 2017 at 7:32 AM, Scott Halgrim <[email protected]>
> wrote:
>
> > This doesn't really answer your question, but for what it's worth,
> > virtually our entire pipeline is written in R. We use BashOperators to
> > run a templated Rscript command.
> >
> > On Jul 13, 2017, 6:21 AM -0700, Andrew Maguire <[email protected]>,
> > wrote:
> > > Hey,
> > >
> > > I'm sure this has been asked hundreds of times before.
> > >
> > > Are there any plans for adding R script operators?
> > >
> > > I looked around the contrib part of the code base but couldn't find
> > > anything.
> > >
> > > I found some tickets in the JIRA, but they seemed to be from around 2014
> > > and maybe for stuff that has since been removed.
> > >
> > > I'm porting lots of jobs over to Airflow and just trying to assess whether
> > > it's worth redoing them in Python, calling them with Bash operators, or
> > > just leaving them in my cron jobs for now.
> > >
> > > Would be happy to help out testing or reviewing anything in any way if
> > > there are efforts ongoing.
> > >
> > > Cheers,
> > > Andy
> >
>
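
For reference, the BashOperator approach mentioned in the thread can be
sketched roughly as below. This is a minimal illustration under stated
assumptions, not anyone's actual pipeline: the helper name
build_rscript_command, the script path /opt/jobs/rank.R, and the arguments
are all hypothetical. The code only builds the shell command string; the
assumption is that the string would then be passed as the bash_command
argument of a BashOperator, where Airflow renders Jinja placeholders such
as {{ ds }} before running the command.

```python
import shlex


def build_rscript_command(script_path, args=()):
    """Return a shell command string that runs an R script via Rscript.

    Every piece is shell-quoted, so paths or arguments containing spaces
    (or an unrendered Jinja placeholder) stay a single shell word.
    """
    parts = ["Rscript", shlex.quote(script_path)]
    parts.extend(shlex.quote(str(arg)) for arg in args)
    return " ".join(parts)


# Hypothetical script and arguments. "{{ ds }}" is the Airflow template
# variable for the execution date; it is left for Airflow to render.
command = build_rscript_command("/opt/jobs/rank.R", ["{{ ds }}", "--top", 10])
print(command)
```

Keeping the command construction in a plain function means it can be
checked without an Airflow installation, and a site-specific operator of
the kind Max describes could reuse the same helper.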
