Hi Jay,
For your immediate use case, will the MR mode work? If that is the case, you 
can take a look at Hive Distcp:
https://gobblin.readthedocs.io/en/latest/case-studies/Hive-Distcp/

For GOBBLIN-714, can you attach any relevant stacktraces that you see in the 
cluster logs that indicate the failure of the jobs? It is interesting that the 
Job execution state for most of the jobs is shown as COMMITTED as opposed to 
SUCCESSFUL.

Thanks,
Sudarshan


________________________________
From: Jay Sen <[email protected]>
Sent: Tuesday, April 2, 2019 8:02 PM
To: Sudarshan Vasudevan; [email protected]
Subject: Re: Gobblin on Yarn ?

Thanks Sudarshan for sharing the info.

I started playing around with Gobblin cluster (master/worker) mode and came across 
some weird issues (GOBBLIN-714 <https://issues.apache.org/jira/browse/GOBBLIN-714> 
and GOBBLIN-711 <https://issues.apache.org/jira/browse/GOBBLIN-711>).

I assume standalone mode is limited to a single node (maybe multi-process), 
so I really need a cluster environment capable of tolerating node failures, 
etc.

The immediate use case I am looking at is Hive to Hive, with about 10 TB a day overall.

Please let me know your thoughts.

Thanks
Jay

On Sun, Mar 31, 2019 at 8:29 PM Sudarshan Vasudevan 
<[email protected]> wrote:
Hi Jay,
We run both Gobblin Cluster and Gobblin Standalone in production, and both are 
fairly stable. We also run Gobblin pipelines in MapReduce mode in 
production.

There is some recent interest in reviving Gobblin-on-Yarn for a few internal use 
cases. We will hopefully have something to share on that front, so stay tuned!

If you share more details about your use case (e.g. details about the 
source/sink, volume of data to be moved), that will help us point you in the 
right direction.

Best,
Sudarshan
________________________________
From: Jay Sen <[email protected]>
Sent: Sunday, March 31, 2019 7:07 PM
To: [email protected]
Subject: Re: Gobblin on Yarn ?

Hi All,

What would be the most stable mode in Gobblin to run in production:
cluster (master + worker), standalone, or something else?

Which mode are you running in prod? Could you please share?

Thanks
Jay


On Wed, Feb 27, 2019 at 6:16 PM Jay Sen 
<[email protected]> wrote:

> Hi,
>
> Is anybody running Gobblin in YARN mode in production, or even in a dev
> environment? Could you please share your experience?
>
> I am looking for some data points on how it would be beneficial over standalone.
>
> Thanks
> Jay
>
