Hadoop 2.8's s3a collects a lot more metrics here, most of which you can get on
HDP-2.5 if you can grab those JARs. Everything comes out as Hadoop JMX metrics,
and is also readable & aggregatable through a call to
FileSystem.getStorageStatistics().


Measuring IO time isn't something that gets picked up, because it's genuinely
hard to measure in a multithreaded world: you can't just add up the seconds
spent talking to S3, because that can make things look unduly negative. We did
try adding up upload times & bytes uploaded to derive a bandwidth metric, but
it turned out to be fairly misleading. If 10 threads each took 60s to upload a
megabyte of data, summing the per-thread times would lead you to conclude the
bandwidth is 10 MB every 10 minutes, i.e. 1 MB per minute; if they were
actually running in parallel, the real bandwidth is 10 MB per minute: 10x as
fast.
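
Spelled out with those numbers (a trivial sketch, just restating the
arithmetic above):

    // 10 threads, each uploading 1 MB in 60s.
    val threads = 10
    val secondsEach = 60.0
    val mbEach = 1.0

    // Summing per-thread time: 10 MB over 600s of thread time = 1 MB/minute.
    val naiveMBPerMinute = (threads * mbEach) / (threads * secondsEach / 60)

    // Wall clock, fully parallel: 10 MB over 60s = 10 MB/minute.
    val actualMBPerMinute = (threads * mbEach) / (secondsEach / 60)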


FWIW, the main bottlenecks in s3a performance are:


1. Time for metadata operations. This is shockingly bad:
http://steveloughran.blogspot.co.uk/2016/12/how-long-does-filesystemexists-take.html
  As well as code improvements up the stack, you can help here by not having
deep directory structures for partitioning; prefer wider trees (see the sketch
after this list).

2. The cost of re-opening the HTTPS connection after a forward/backward seek.
Hadoop 2.8's s3a does a lot of work here to improve things: it handles forward
seeks of many KB by reading within the open stream rather than aborting and
restarting the connection, and adds an fadvise=random option for maximum
performance on columnar storage formats (ORC, Parquet); see the sketch after
this list.

3. The way s3a waits until close() before uploading data. The
fs.s3a.fast.output.enabled=true option addresses this, but it's pretty brittle
in Hadoop 2.7.x, as it uses lots of on-heap storage if code generates data
faster than the upload bandwidth; 2.8 can buffer to hard disk instead (see the
sketch after this list).

4. Time to commit data. Rename is an O(data) copy server side, at 6-10 MB/s,
so this needs a committer which doesn't rename. Spark 1.6's
DirectOutputCommitter avoided that, but it couldn't handle failure & retry.
Future committers will.
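
To make (1)-(3) concrete, here's a rough sketch of setting the relevant
options from a Spark job. The paths and the "date" partition column are
hypothetical; fs.s3a.experimental.input.fadvise, fs.s3a.fast.upload and
fs.s3a.fast.upload.buffer are the stock Apache Hadoop 2.8 property names, and
vendor builds may spell the fast-upload switch differently (e.g. the
fs.s3a.fast.output.enabled name above):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("s3a-tuning").getOrCreate()
    val hc = spark.sparkContext.hadoopConfiguration

    // (2) Random-IO read profile for columnar formats (ORC, Parquet).
    // Set before the first s3a filesystem instance is created, as
    // filesystem instances are cached with their configuration.
    hc.set("fs.s3a.experimental.input.fadvise", "random")

    // (3) Upload blocks as they are written instead of waiting for close(),
    // buffering to local disk rather than the heap (Hadoop 2.8+).
    hc.set("fs.s3a.fast.upload", "true")
    hc.set("fs.s3a.fast.upload.buffer", "disk")

    // (1) Wide rather than deep partitioning: a single partition column
    // means one directory level to probe instead of three.
    val df = spark.read.parquet("s3a://my-bucket/input/")
    df.write.partitionBy("date").parquet("s3a://my-bucket/output/")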


-Steve


________________________________
From: Gili Nachum <gilinac...@gmail.com>
Sent: 13 February 2017 06:55
To: user@spark.apache.org
Subject: How to measure IO time in Spark over S3

Hi!

How can I tell the IO duration for a Spark application doing R/W from S3
(using S3 as a filesystem via sc.textFile("s3a://..."))?
I would like to know the % of time spent doing IO out of the overall app
execution time.

Gili.
