Thanks, Sudarshan, for sharing the info. I started playing around with Gobblin cluster (master/worker) mode and came across some weird issues (GOBBLIN-714 <https://issues.apache.org/jira/browse/GOBBLIN-714> & GOBBLIN-711 <https://issues.apache.org/jira/browse/GOBBLIN-711>).
I assume standalone mode is limited to a single node (maybe multi-process), so I really need a cluster environment capable of tolerating node failures, etc. The immediate use case I am looking at is Hive to Hive, with about 10 TB a day overall.

Please let me know your thoughts.

Thanks
Jay

On Sun, Mar 31, 2019 at 8:29 PM Sudarshan Vasudevan <[email protected]> wrote:

> Hi Jay,
> We run both Gobblin Cluster and Gobblin Standalone in production, which
> are both fairly stable. We also run Gobblin pipelines in Mapreduce mode in
> production.
>
> There is some recent interest to revive Gobblin-on-Yarn for a few internal
> use cases. We will hopefully have something to share on that front. So stay
> tuned!
>
> If you share more details about your use case (e.g. details about the
> source/sink, volume of data to be moved), that will help us point you in
> the right direction.
>
> Best,
> Sudarshan
> ------------------------------
> *From:* Jay Sen <[email protected]>
> *Sent:* Sunday, March 31, 2019 7:07 PM
> *To:* [email protected]
> *Subject:* Re: Gobblin on Yarn ?
>
> Hi All,
>
> What would be the most stable mode in gobblin to run on production ?
> cluster ( master + worker ) or standalone or any other ?
>
> what is the mode you are running on prod ? can u guys pls share ?
>
> Thanks
> Jay
>
>
> On Wed, Feb 27, 2019 at 6:16 PM Jay Sen <[email protected]> wrote:
>
> > Hi,
> >
> > anybody running Gobblin on yarn mode in production or even in dev
> > environment ? can u share pls the experience?
> >
> > looking for some data points on how it would be beneficial over
> > standalone.
> >
> > Thanks
> > Jay
> >
>
