[
https://issues.apache.org/jira/browse/HADOOP-14475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057270#comment-16057270
]
Steve Loughran commented on HADOOP-14475:
-----------------------------------------
bq. The context name was changed just to distinguish it from other attributes,
such as the MetricsRegistry and Metrics names. The following log shows that
using different names is better than reusing the same name:
{code}
17/06/05 20:32:54 DEBUG impl.MetricsSinkAdapter: Pushing record
S3AFileSystemMetrics.s3a.s3afilesystem to file
{code}
We should be OK staying with "S3AFileSystemMetrics" for now.
bq. 2. After mapping out the relationships among those classes, I also think
the functions of class S3AFileSystemMetricsSystem can be merged into an
existing class, maybe S3AFileSystem.
{{S3AFileSystem}} is *way too big* right now; we've been pulling everything out
into its own isolated classes wherever possible. It's a losing battle (look at
the HADOOP-13345 branch), but we try. Generally we're doing this with
package-private classes which take {{S3AFileSystem owner}} as a constructor
argument.
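A minimal sketch of that owner-as-constructor-argument pattern, for illustration only. Every name here ({{OwnerFileSystem}}, {{RequestCounter}}, {{getBucket()}}) is a hypothetical stand-in, not a class or method from the S3A code:

```java
// Sketch only: all names below are hypothetical stand-ins, not Hadoop classes.
final class OwnerDelegationSketch {

  /** Stand-in for the large filesystem class that owns its helpers. */
  static final class OwnerFileSystem {
    private final String bucket;
    private final RequestCounter counter;

    OwnerFileSystem(String bucket) {
      this.bucket = bucket;
      // The helper receives its owner through the constructor.
      this.counter = new RequestCounter(this);
    }

    String getBucket() {
      return bucket;
    }

    RequestCounter getCounter() {
      return counter;
    }
  }

  /** Package-private helper holding logic pulled out of the owner class. */
  static final class RequestCounter {
    private final OwnerFileSystem owner;
    private long requests;

    RequestCounter(OwnerFileSystem owner) {
      this.owner = owner;
    }

    void recordRequest() {
      requests++;
    }

    String summary() {
      // The helper reads owner state through accessors instead of duplicating it.
      return owner.getBucket() + ": " + requests + " request(s)";
    }
  }

  public static void main(String[] args) {
    OwnerFileSystem fs = new OwnerFileSystem("example-bucket");
    fs.getCounter().recordRequest();
    System.out.println(fs.getCounter().summary());
  }
}
```

The point of the split is that the helper can be read and tested on its own, while the owner class only has to construct it and expose it.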
Regarding instances:
* Calls to {{FileSystem.get(URI, conf)}} or {{Path.getFileSystem(conf)}} will
return the shared FS for that user.
* Unless the relevant system property to create unique instances for every call
has been set.
* We like to share FS instances to allow for sharing of thread pools (S3,
Azure) and IPC channels (HDFS), so the unique instances are generally left for
when you are changing the Configuration settings and really want new instances.
* Ideally an MR/Hive/Spark job should have one instance per user per JVM.
* The MR job can then call {{FileSystem.getStatistics()}} after the run to get
the statistics for every FS in the JVM, which we can aggregate across the
entire job.
What this means is that MR jobs *should* have one S3AFS instance per JVM
(single-user app and all), but services such as Hive LLAP will have many
instances, created when queries come in and released afterwards.
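The instance sharing described above can be modelled roughly as a cache keyed by filesystem and user. This is a simplified, self-contained sketch and not Hadoop's actual cache code: the class names, the key fields (scheme, authority, user) and the {{disableCache}} flag are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Simplified model of per-user filesystem instance sharing.
// All names here are hypothetical; this is not Hadoop's implementation.
final class FsCacheSketch {

  /** Cache key: which store, and on behalf of which user. */
  static final class Key {
    final String scheme;
    final String authority;
    final String user;

    Key(String scheme, String authority, String user) {
      this.scheme = scheme;
      this.authority = authority;
      this.user = user;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key other = (Key) o;
      return scheme.equals(other.scheme)
          && authority.equals(other.authority)
          && user.equals(other.user);
    }

    @Override
    public int hashCode() {
      return Objects.hash(scheme, authority, user);
    }
  }

  /** Stand-in for a filesystem instance with its own pools and metrics. */
  static final class FsInstance {
    final Key key;

    FsInstance(Key key) {
      this.key = key;
    }
  }

  private final Map<Key, FsInstance> cache = new HashMap<>();

  /** Returns the shared instance for the key, or a fresh one if caching is off. */
  synchronized FsInstance get(String scheme, String authority, String user,
      boolean disableCache) {
    Key key = new Key(scheme, authority, user);
    if (disableCache) {
      // Unique instance per call; never stored in the cache.
      return new FsInstance(key);
    }
    return cache.computeIfAbsent(key, FsInstance::new);
  }

  synchronized int cachedInstances() {
    return cache.size();
  }

  public static void main(String[] args) {
    FsCacheSketch fsCache = new FsCacheSketch();
    FsInstance a = fsCache.get("s3a", "bucket", "alice", false);
    FsInstance b = fsCache.get("s3a", "bucket", "alice", false);
    System.out.println("shared for same user: " + (a == b));
  }
}
```

In this model a single-user MR task ends up with one cached instance per store, while a multi-user service accumulates one per user, which is why per-instance metrics need to cope with many instances in one JVM.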
> Metrics of S3A don't print out when enable it in Hadoop metrics property file
> ------------------------------------------------------------------------------
>
> Key: HADOOP-14475
> URL: https://issues.apache.org/jira/browse/HADOOP-14475
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.8.0
> Environment: uname -a
> Linux client01 4.4.0-74-generic #95-Ubuntu SMP Wed Apr 12 09:50:34 UTC 2017
> x86_64 x86_64 x86_64 GNU/Linux
> cat /etc/issue
> Ubuntu 16.04.2 LTS \n \l
> Reporter: Yonger
> Assignee: Yonger
> Attachments: s3a-metrics.patch1, stdout.zip
>
>
> *.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
> #*.sink.file.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> #*.sink.influxdb.url=http:/xxxxxxxxxx
> #*.sink.influxdb.influxdb_port=8086
> #*.sink.influxdb.database=hadoop
> #*.sink.influxdb.influxdb_username=hadoop
> #*.sink.influxdb.influxdb_password=hadoop
> #*.sink.ingluxdb.cluster=c1
> *.period=10
> #namenode.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> #S3AFileSystem.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> S3AFileSystem.sink.file.filename=s3afilesystem-metrics.out
> I can't find the output file even when I run an MR job that should use S3.