[jira] [Commented] (HADOOP-14475) Metrics of S3A don't print out when enable it in Hadoop metrics property file

Steve Loughran (JIRA) Tue, 06 Jun 2017 14:05:36 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-14475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16039637#comment-16039637
 ]


Steve Loughran commented on HADOOP-14475:
-----------------------------------------

bq. 3. That is the issue that confused me. I still don't know why the 
filesystem (S3AFileSystem) is initialized multiple times in an MR job. Judging 
from AzureFileSystem and DataNodeMetric, the filesystem and MetricSystem should 
be initialized only once.

Every connection to a different bucket gets its own FS instance, with its own 
settings; if your mapper or reducer works with more than one bucket, it uses 
more than one FS. This is more obvious in applications like Hive and Spark, 
where a single process handles many requests from different people, and FS 
instances are cached separately for each user as well as for each bucket (have 
a look at FileSystem.get()). You'd get the same with Azure when talking to 
different buckets in the same process.
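
The caching behaviour described above can be sketched roughly as follows. This 
is a simplified model under stated assumptions, not Hadoop's FileSystem.get() 
implementation: the names FsCacheSketch, Key and FakeFs are hypothetical, and 
the key is assumed to consist only of scheme, authority (the bucket) and user.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified model of a filesystem cache keyed by
// scheme + authority (bucket) + user. NOT Hadoop's actual code.
final class FsCacheSketch {

    /** Cache key: one entry per scheme, bucket and user. */
    static final class Key {
        final String scheme;
        final String authority;
        final String user;

        Key(URI uri, String user) {
            this.scheme = uri.getScheme();
            this.authority = uri.getAuthority();
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return Objects.equals(scheme, other.scheme)
                && Objects.equals(authority, other.authority)
                && Objects.equals(user, other.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, authority, user);
        }
    }

    /** Stand-in for a filesystem instance; a real one would carry its own settings. */
    static final class FakeFs {
        final String label;

        FakeFs(String label) {
            this.label = label;
        }
    }

    private final Map<Key, FakeFs> cache = new HashMap<>();
    private int created = 0;

    /** Returns the cached instance for (scheme, bucket, user), creating it on first use. */
    FakeFs get(String uri, String user) {
        Key key = new Key(URI.create(uri), user);
        return cache.computeIfAbsent(key, k -> {
            created++;
            return new FakeFs(k.scheme + "://" + k.authority + " as " + k.user);
        });
    }

    /** Number of distinct instances created so far. */
    int instancesCreated() {
        return created;
    }

    public static void main(String[] args) {
        FsCacheSketch fsCache = new FsCacheSketch();
        fsCache.get("s3a://bucket-a/input", "alice");
        fsCache.get("s3a://bucket-a/output", "alice"); // same bucket and user: reused
        fsCache.get("s3a://bucket-b/input", "alice");  // different bucket: new instance
        fsCache.get("s3a://bucket-a/input", "bob");    // different user: new instance
        System.out.println("instances created: " + fsCache.instancesCreated());
    }
}
```

In this model, a job that reads from one bucket and writes to another ends up 
with two instances even for a single user, which is consistent with seeing the 
filesystem initialized more than once in one process.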

> Metrics of S3A don't print out  when enable it in Hadoop metrics property file
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-14475
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14475
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.8.0
>         Environment: uname -a
> Linux client01 4.4.0-74-generic #95-Ubuntu SMP Wed Apr 12 09:50:34 UTC 2017 
> x86_64 x86_64 x86_64 GNU/Linux
>  cat /etc/issue
> Ubuntu 16.04.2 LTS \n \l
>            Reporter: Yonger
>         Attachments: s3a-metrics.patch1
>
>
> *.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
> #*.sink.file.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> #*.sink.influxdb.url=http:/xxxxxxxxxx
> #*.sink.influxdb.influxdb_port=8086
> #*.sink.influxdb.database=hadoop
> #*.sink.influxdb.influxdb_username=hadoop
> #*.sink.influxdb.influxdb_password=hadoop
> #*.sink.ingluxdb.cluster=c1
> *.period=10
> #namenode.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> #S3AFileSystem.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> S3AFileSystem.sink.file.filename=s3afilesystem-metrics.out
> I can't find the output file even when I run an MR job that should use S3.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

