Hi Zhixiong,

The driver log says the tasks have completed, but the Hadoop console shows the 
mappers failing more than 20 times. The job has been running for 50 minutes and 
hasn't stopped; it looks like it keeps retrying something again and again. When 
I comment out the custom reporter, the job runs normally and completes 5 
mappers in about 8 minutes. The problem occurs with the 
CustomCodahaleReporterFactory interface.

After changing the reporter to CustomReporterFactory, the job runs normally.

Zhang Xiuzhu

From: Zhixiong Chen [mailto:[email protected]]
Sent: Friday, June 15, 2018 2:23 AM
To: [email protected]; [email protected]
Subject: Re: Gobblin Metrics

Hi Xiuzhu,

Given an error, I usually find the direct cause in the log and, from there, 
trace the reason bottom-up or top-down along the job topology. The majority of 
Gobblin jobs have a topology similar to the one modeled here: 
https://gobblin.readthedocs.io/en/latest/Gobblin-Architecture/#gobblin-constructs.

In your case, did the mapreduce job complete? If it did, the Gobblin job was 
hanging. You can get a thread dump using the `jstack` utility and analyze in 
which function/method the job hung. This step can help you find the direct 
cause. For root cause investigation, you have to work out the topology of your 
job from your job configuration and, for each construct or component, reason 
about whether it could have caused the error.
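
If attaching `jstack` to the process is inconvenient, a similar dump can be produced from inside the JVM. The following is a minimal sketch using only the JDK's `ThreadMXBean`; the class name `ThreadDumpSketch` is hypothetical and not part of Gobblin, and the output format is a simplification of what `jstack` prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {
    /** Returns a simplified, jstack-like dump of all live threads in this JVM. */
    public static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and locked-synchronizer details
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" state=")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Threads that stay in `WAITING` or `TIMED_WAITING` at the same frame across several dumps are the usual candidates for where a job is stuck.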

Find exceptions and errors in your mapper log and driver log. They are good 
entry points for reasoning.

Zhixiong
________________________________
From: Zhang, Xiuzhu(AWF) <[email protected]<mailto:[email protected]>>
Sent: Thursday, June 14, 2018 9:35 AM
To: 
[email protected]<mailto:[email protected]>; 
[email protected]<mailto:[email protected]>
Subject: RE: Gobblin Metrics


Hi Zhixiong,



Yes, with the reporting feature I can fetch the metrics in MR mode, but now 
there is a stranger problem: the logs say the tasks have completed, yet the job 
keeps running and doesn't seem to stop. The data has been published 
successfully, but the mapreduce map tasks failed multiple times and the number 
of successful attempts is 0.



I use CustomCodahaleReporterFactory and know the reporter reports at a fixed 
rate (configurable; the default is 30s). Could this be the reason the job keeps 
running? Is there a way to control it?
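
As general background, and not a diagnosis of this job: a fixed-rate reporter is typically driven by a scheduler thread, and a non-daemon scheduler that is never shut down keeps its JVM alive after the real work is done. The sketch below uses only the JDK; `FixedRateReporterSketch` is a hypothetical class, not Gobblin's reporter, and whether CustomCodahaleReporterFactory behaves this way is an assumption that this thread does not confirm.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedRateReporterSketch implements AutoCloseable {
    private final AtomicInteger reports = new AtomicInteger();

    // Single scheduler thread; marked daemon so it cannot block JVM exit.
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread t = new Thread(runnable, "sketch-reporter");
            t.setDaemon(true);
            return t;
        });

    /** Starts "reporting" (here: just counting) at a fixed rate. */
    public void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(reports::incrementAndGet, 0, period, unit);
    }

    public int reportCount() {
        return reports.get();
    }

    public boolean isStopped() {
        return scheduler.isTerminated();
    }

    /** Stops the scheduler explicitly so no reporting thread outlives the task. */
    @Override
    public void close() {
        scheduler.shutdownNow();
        try {
            scheduler.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
    }
}
```

Either safeguard alone (daemon thread or explicit `close()`) is enough to let a JVM exit; the sketch shows both.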



Here is part of the logs:

2018-06-14 09:12:20 PDT INFO  [TaskStateCollectorService RUNNING] gobblin.runtime.JobContext  374 - 2 more tasks of job job_calmulti1-2018061308_1528990788975 have completed

2018-06-14 09:12:20 PDT INFO  [TaskStateCollectorService RUNNING] gobblin.runtime.JobContext  364 - Writing job execution information to the job history store

2018-06-14 09:13:20 PDT WARN  [TaskStateCollectorService RUNNING] gobblin.runtime.TaskStateCollectorService  131 - No output task state files found in /user/pp_dt_risk_batch/gobblin-dist-hdp/job/working/calmulti1-2018061308/job_calmulti1-2018061308_1528990788975/output/job_calmulti1-2018061308_1528990788975

2018-06-14 09:14:12 PDT INFO  [main] org.apache.hadoop.mapreduce.Job  1406 - Task Id : attempt_1528927363842_14570_m_000004_1, Status : FAILED

2018-06-14 09:14:13 PDT INFO  [main] org.apache.hadoop.mapreduce.Job  1367 -  map 60% reduce 0%

2018-06-14 09:14:20 PDT WARN  [TaskStateCollectorService RUNNING] gobblin.runtime.TaskStateCollectorService  131 - No output task state files found in /user/pp_dt_risk_batch/gobblin-dist-hdp/job/working/calmulti1-2018061308/job_calmulti1-2018061308_1528990788975/output/job_calmulti1-2018061308_1528990788975



Thanks,

Zhang Xiuzhu





From: Zhixiong Chen [mailto:[email protected]]
Sent: Wednesday, June 13, 2018 1:32 AM
To: [email protected]<mailto:[email protected]>; 
[email protected]<mailto:[email protected]>
Subject: Re: Gobblin Metrics



Hi Xiuzhu,



If I understand correctly, you're trying to get counter info in the code 
block (`TaskState#toTaskExecutionInfo`) when the job is running in mapreduce 
mode.



The answer is:



You can't get task metrics from the driver in mapreduce mode because, in 
mapreduce mode, `Task` runs on a mapper, which is usually a different machine 
or JVM from the driver. You can't read task metrics that were set in a 
different JVM.



In standalone mode, `Task` runs on the same machine and in the same JVM as the 
driver, which is why you can read the value.
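
The difference between the two modes can be pictured with a toy model. This is only a sketch under stated assumptions: `PerJvmRegistrySketch` and its nested `Registry` are hypothetical stand-ins for an in-memory metric registry, not Gobblin or Codahale classes, and the two objects merely simulate state held in two separate JVMs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class PerJvmRegistrySketch {
    /** Toy in-memory counter registry; each JVM would hold its own instance. */
    public static final class Registry {
        private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

        public void inc(String name, long n) {
            counters.computeIfAbsent(name, k -> new LongAdder()).add(n);
        }

        public long count(String name) {
            LongAdder adder = counters.get(name);
            return adder == null ? 0 : adder.sum();
        }

        public int size() {
            return counters.size();
        }
    }

    public static void main(String[] args) {
        Registry mapperJvm = new Registry(); // simulates state in the mapper process
        Registry driverJvm = new Registry(); // simulates state in the driver process

        // The task increments its counter where it runs: on the mapper.
        mapperJvm.inc("records.read", 5);

        System.out.println("mapper sees: " + mapperJvm.count("records.read"));
        System.out.println("driver sees " + driverJvm.size() + " counters");
    }
}
```

In standalone mode both roles share one registry, so the driver-side read finds the counter; in mapreduce mode nothing copies the mapper's in-memory state back to the driver.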



However, if your goal is to fetch task or job metrics, the right way is:

  1.  Configure the job to emit metrics to a metrics store; that's why I 
mentioned `reporter` in my previous reply. For more information, see 
https://gobblin.readthedocs.io/en/latest/metrics/Gobblin-Metrics/
  2.  Find the metrics you want in the store.

Thanks,

Zhixiong

________________________________

From: Zhang, Xiuzhu(AWF) <[email protected]<mailto:[email protected]>>
Sent: Monday, June 11, 2018 6:58 PM
To: 
[email protected]<mailto:[email protected]>; 
[email protected]<mailto:[email protected]>
Subject: RE: Gobblin Metrics



Hi Zhixiong,



I haven't configured a reporter and haven't seen any exceptions.



It looks like TaskState.java is responsible for collecting counter-type 
metrics, as follows:

// Add task metrics
TaskMetrics taskMetrics = TaskMetrics.get(this);
MetricArray metricArray = new MetricArray();

for (Map.Entry<String, ? extends com.codahale.metrics.Metric> entry :
    taskMetrics.getMetricContext().getCounters().entrySet()) {
  Metric counter = new Metric();
  counter.setGroup(MetricGroup.TASK.name());
  counter.setName(entry.getKey());
  counter.setType(MetricTypeEnum.valueOf(GobblinMetrics.MetricType.COUNTER.name()));
  counter.setValue(Long.toString(((Counter) entry.getValue()).getCount()));
  metricArray.add(counter);
}



Here the MetricContext object can be fetched, but the number of counters is 0. 
Why can't the metrics added via Counter from the MetricContext be fetched at 
this point? Strangely, Gobblin's own predefined metrics aren't fetched either. 
In standalone mode, however, everything works normally.



Is this a bug, or do I need to configure something else for mapreduce mode?



Thanks,
Zhang Xiuzhu





From: Zhixiong Chen [mailto:[email protected]]
Sent: Tuesday, June 12, 2018 1:19 AM
To: [email protected]<mailto:[email protected]>; 
[email protected]<mailto:[email protected]>
Subject: Re: Gobblin Metrics



Hi Xiuzhu,



Which metric reporter are you using?



Did you see exceptions about sending metrics in mapreduce mode?



Zhixiong,

________________________________

From: Zhang, Xiuzhu(AWF) <[email protected]<mailto:[email protected]>>
Sent: Monday, June 11, 2018 2:19 AM
To: [email protected]<mailto:[email protected]>; 
[email protected]<mailto:[email protected]>
Subject: Gobblin Metrics



Hi,



Regarding Gobblin metrics, I have encountered a strange issue. I wrote some 
custom metrics using the MetricContext object in InstrumentedExtractor and 
InstrumentedDataWriter. I ran the job in both standalone and mapreduce mode; 
after the job completed, only the standalone run had metrics information in 
the gobblin_task_metrics table. I don't understand why there are no metrics in 
mapreduce mode.



Does anyone know what I need to look at?



I use it as follows:



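public class xxxExtractor extends InstrumentedExtractor<xxx, xxx> {

  private MetricContext metricContext;

  public xxxExtractor(WorkUnitState state) {
    // The base class takes the WorkUnitState and sets up instrumentation
    super(state);
    this.metricContext = this.getMetricContext();
  }

  private void xxx() {
    // Look up (or create) the named counter, then increment it
    Counter counter = metricContext.counter("xx.xxx.xxx");
    counter.inc(xxx);
  }

}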



Thanks,

Zhang Xiuzhu

