Re: Read locality on gobblin jobs

2019-08-15 Thread Kuai Yu
Yes, your understanding is correct.

From: Jay Sen
Sent: Thursday, August 15, 2019 11:05 AM
To: dev@gobblin.incubator.apache.org
Subject: Re: Read locality on gobblin jobs

Hi Kuai, it looks like the input to the MapReduce job is the workunit file, not the actual data

Re: Read locality on gobblin jobs

2019-08-15 Thread Jay Sen
Hi Kuai, it looks like the input to the MapReduce job is the workunit file, not the actual data, which will be fetched by the Gobblin task in the mapper once the mapper container is spun up. That makes sense given that the data is not available before the task executes, but in case the data is local in the
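
A rough way to picture the point about the workunit file is the toy sketch below. It is not Gobblin or Hadoop code, and every name, path, and host in it is hypothetical. The assumption it illustrates is that a locality-aware scheduler can only prefer hosts holding the file it is handed as input. If that input is the workunit file, the hints describe where the workunit file lives, not where the source data lives.

```python
# Toy model only: none of these names come from Gobblin or Hadoop.
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical block placement: file path -> hosts holding a replica.
BLOCK_HOSTS: Dict[str, List[str]] = {
    "/data/events/part-0": ["node-a", "node-b"],
    "/jobs/job1/workunit-0.wu": ["node-c"],
}


@dataclass
class Split:
    path: str  # the file the scheduler sees as the task's input


def locality_hints(split: Split) -> List[str]:
    """Hosts a locality-aware scheduler would prefer for this split."""
    return BLOCK_HOSTS.get(split.path, [])


# Classic MR job: the split is the data file, so the hints match the
# hosts that actually hold the data.
data_split = Split("/data/events/part-0")

# Workunit-driven job: the split is the workunit file. The data path is
# only read later, inside the task, so the scheduler never sees it.
workunit_split = Split("/jobs/job1/workunit-0.wu")

print(locality_hints(data_split))
print(locality_hints(workunit_split))
```

In this model the second call returns only the host of the workunit file, which is why the mapper can land on a node that holds none of the data it goes on to read.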

Re: Read locality on gobblin jobs

2019-08-12 Thread Kuai Yu
The Helix framework in cluster mode doesn't have a data locality concept. I think that exists only in YARN/MR mode.

From: Jay Sen
Sent: Sunday, August 11, 2019 5:25 PM
To: dev@gobblin.incubator.apache.org
Subject: Read locality on gobblin jobs

Hi Dev Team, when

Read locality on gobblin jobs

2019-08-11 Thread Jay Sen
Hi Dev Team, when Gobblin runs in cluster mode or MR mode, if the job needs to read data from a Hadoop filesystem that is local, i.e. on the same Gobblin cluster, does Gobblin or Helix figure out the data locality automatically (as in a typical MR job)? I doubt this is the case, but