Thanks for the research, Kay!

It does seem to be addressed, and hopefully fixed, in that ticket conversation and also in https://issues.apache.org/jira/browse/HDFS-4697. So the best thing here is to wait until I can upgrade to a version of Hadoop that has that fix and then repeat the test. That will be quite a while for me (at least early 2015), but I'd be interested in hearing from people who are already on CDH5+ and can attempt to replicate the experiment.

Cheers,
Andrew

On Tue, Sep 30, 2014 at 2:26 PM, Kay Ousterhout <kayousterh...@gmail.com> wrote:

> Hi Andrew and Gary,
>
> I've done some experimentation with this and had similar results. I can't
> explain the speedup in write performance, but I dug into the read slowdown
> and found that enabling short-circuit reads (SCR) results in Hadoop not
> doing read-ahead in the same way. At a high level, when SCR is off, HDFS
> does read-ahead on input data, so much of the time spent reading input data
> is pipelined with computation. There were some bugs with SCR where, when SCR
> was turned on, reading no longer got pipelined, slowing down performance.
> In particular, I believe that non-short-circuit reads use fadvise to tell
> the OS to read the file in the background, which is not done with
> short-circuit reads. This problem is partially described in
> https://issues.apache.org/jira/browse/HDFS-5634, a seemingly unrelated
> JIRA that mentions this way down in some of the comments. This was
> supposedly fixed in newer versions of Hadoop but I haven't verified it.
>
> -Kay
>
>> ---------- Forwarded message ----------
>> From: Andrew Ash <and...@andrewash.com>
>> Date: Tue, Sep 30, 2014 at 1:33 PM
>> Subject: Re: Short Circuit Local Reads
>> To: Matei Zaharia <matei.zaha...@gmail.com>
>> Cc: "user@spark.apache.org" <user@spark.apache.org>, Gary Malouf <malouf.g...@gmail.com>
>>
>> Hi Gary,
>>
>> I gave this a shot on a test cluster of CDH4.7 and actually saw a
>> regression in performance when running the numbers. Have you done any
>> benchmarking? Below are my numbers:
>>
>> Experimental method:
>> 1. Write 14GB of data to HDFS via [1]
>> 2. Read data multiple times via [2]
>>
>> Experiment 1: run on virtual machines
>>
>> With short-circuit read disabled:
>> 14/09/24 15:10:49 INFO spark.SparkContext: Job finished: saveAsTextFile at <console>:13, took 344.931469949 s
>> 14/09/24 15:11:30 INFO spark.SparkContext: Job finished: count at <console>:13, took 18.601568871 s
>> 14/09/24 15:11:54 INFO spark.SparkContext: Job finished: count at <console>:13, took 16.531909024 s
>> 14/09/24 15:12:18 INFO spark.SparkContext: Job finished: count at <console>:13, took 17.639692651 s
>> 14/09/24 15:12:38 INFO spark.SparkContext: Job finished: count at <console>:13, took 16.773438345 s
>>
>> With short-circuit read enabled:
>> 14/09/24 14:28:38 INFO spark.SparkContext: Job finished: saveAsTextFile at <console>:13, took 299.511103592 s
>> 14/09/24 14:29:17 INFO spark.SparkContext: Job finished: count at <console>:13, took 22.459146194 s
>> 14/09/24 14:29:44 INFO spark.SparkContext: Job finished: count at <console>:13, took 19.806642815 s
>> 14/09/24 14:30:11 INFO spark.SparkContext: Job finished: count at <console>:13, took 20.284644308 s
>> 14/09/24 14:30:40 INFO spark.SparkContext: Job finished: count at <console>:13, took 21.720455219 s
>>
>> My summary here is that enabling short-circuit read caused the write
>> to go faster (what?) and caused a slight decrease in read performance,
>> from ~17sec to ~20sec.
>>
>> The VMs were backed by FusionIO drives but I thought maybe there was
>> something funky with the VMs so switched to bare hardware in a second
>> experiment.
>>
>> Experiment 2: run on bare hardware
>>
>> With short-circuit read disabled:
>> 14/09/24 15:59:11 INFO spark.SparkContext: Job finished: saveAsTextFile at <console>:13, took 1605.965203162 s
>> 14/09/24 15:59:39 INFO spark.SparkContext: Job finished: count at <console>:13, took 11.984355461 s
>> 14/09/24 16:00:00 INFO spark.SparkContext: Job finished: count at <console>:13, took 11.134712764 s
>> 14/09/24 16:00:11 INFO spark.SparkContext: Job finished: count at <console>:13, took 8.694292372 s
>> 14/09/24 16:00:24 INFO spark.SparkContext: Job finished: count at <console>:13, took 9.83986823 s
>>
>> With short-circuit read enabled:
>> 14/09/24 16:23:14 INFO spark.SparkContext: Job finished: saveAsTextFile at <console>:13, took 1113.897715871 s
>> 14/09/24 16:25:19 INFO spark.SparkContext: Job finished: count at <console>:13, took 14.249690605 s
>> 14/09/24 16:25:47 INFO spark.SparkContext: Job finished: count at <console>:13, took 12.67330165 s
>> 14/09/24 16:26:04 INFO spark.SparkContext: Job finished: count at <console>:13, took 10.673825924 s
>> 14/09/24 16:26:19 INFO spark.SparkContext: Job finished: count at <console>:13, took 9.722516379 s
>>
>> This is separate hardware so the numbers are very different (it's not
>> just bypassing the VM overhead).
>>
>> Again, the writes are much faster (1605s -> 1113s) but the reads are
>> comparable if not slightly slower (~10.4s -> ~11.8s).
>>
>> To make sure that short-circuit reads were actually working I looked
>> at the datanode logs and saw the following line. I think this
>> confirms that a) the read was local (127.0.0.1 -> 127.0.0.1) from
>> Spark and b) short-circuit read was successfully used ("success:
>> true").
>>
>> hadoop-datanode-mybox.local.log:2014-09-24 16:26:52,800 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 127.0.0.1, dest: 127.0.0.1, op: REQUEST_SHORT_CIRCUIT_FDS, blockid: -312380305519226759, srvID: DS-96112752-10.201.12.105-50010-1411586696381, success: true
>>
>> Has anyone actually deployed this feature and benchmarked gains? I
>> was hoping to throw this switch on my clusters and get a 30% perf
>> boost but in practice that has not materialized.
>>
>> Cheers!
>> Andrew
>>
>> [1] sc.parallelize(1 to (14*1024*1024)).map(k => Seq(k, org.apache.commons.lang.RandomStringUtils.random(1024, "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWxyZ0123456789")).mkString("|")).saveAsTextFile("hdfs:///tmp/output")
>> [2] sc.textFile("hdfs:///tmp/output").count
>>
>> On Wed, Sep 17, 2014 at 11:19 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>> >
>> > I'm pretty sure it does help, though I don't have any numbers for it.
>> > In any case, Spark will automatically benefit from this if you link it
>> > to a version of HDFS that contains this.
>> >
>> > Matei
>> >
>> > On September 17, 2014 at 5:15:47 AM, Gary Malouf (malouf.g...@gmail.com) wrote:
>> >
>> > Cloudera had a blog post about this in August 2013:
>> > http://blog.cloudera.com/blog/2013/08/how-improved-short-circuit-local-reads-bring-better-performance-and-security-to-hadoop/
>> >
>> > Has anyone been using this in production - curious as to if it made a
>> > significant difference from a Spark perspective.
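
For anyone repeating the experiment, the read averages quoted in the thread for the bare-hardware run (~10.4s with short-circuit read disabled, ~11.8s with it enabled) can be recomputed from the "Job finished: count" log lines. The following is only a minimal Scala sketch of that arithmetic: the object name `ReadTimings` and its helpers are hypothetical and not part of Spark or Hadoop, and the log lines are rebuilt from the durations posted in the thread.

```scala
// Hypothetical helper (not from the thread, not a Spark API): averages the
// durations of the "count" jobs reported in Spark driver log lines.
object ReadTimings {
  // Matches e.g. "... Job finished: count at <console>:13, took 11.984355461 s".
  // Unanchored, so the timestamp and logger prefix are ignored.
  private val CountJob =
    """Job finished: count at .*, took ([0-9.]+) s""".r.unanchored

  /** Durations in seconds of every finished count job; other lines are skipped. */
  def countDurations(logLines: Seq[String]): Seq[Double] =
    logLines.collect { case CountJob(seconds) => seconds.toDouble }

  /** Arithmetic mean, or NaN when there are no values. */
  def mean(values: Seq[Double]): Double =
    if (values.isEmpty) Double.NaN else values.sum / values.size

  def main(args: Array[String]): Unit = {
    val prefix = "INFO spark.SparkContext: Job finished: count at <console>:13, took"

    // Experiment 2 (bare hardware) read timings, copied from the thread.
    val scrDisabled =
      Seq("11.984355461", "11.134712764", "8.694292372", "9.83986823")
        .map(t => s"$prefix $t s")
    val scrEnabled =
      Seq("14.249690605", "12.67330165", "10.673825924", "9.722516379")
        .map(t => s"$prefix $t s")

    println(f"SCR disabled: mean read time ${mean(countDurations(scrDisabled))}%.1f s")
    println(f"SCR enabled:  mean read time ${mean(countDurations(scrEnabled))}%.1f s")
  }
}
```

Only the `count` lines are matched, so the much larger `saveAsTextFile` durations do not skew the read average. With four runs per configuration and a spread of several seconds between runs, a difference of about 1.4s is suggestive rather than conclusive; more repetitions would be needed to call it a regression.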