No performance benchmarking has been done for HAWQ so far to compare the two schedulers.
For a typical transactional database workload, deadline is better (https://blog.pgaddict.com/posts/postgresql-io-schedulers-cfq-noop-deadline). But the HAWQ workload is typically sequential I/O rather than random reads, so I think it deserves a benchmark to show which is better.

Cheers
Lei

On Fri, Sep 23, 2016 at 12:57 AM, Taylor Vesely <[email protected]> wrote:

> Hi All,
>
> I was running hawq check on a system, and I hit the following error:
>
> 20160909:16:34:48:339941 gpcheck:hdw1:gpadmin-[ERROR]:-host(hdw1): on
> device (sdd) IO scheduler 'cfq' does not match expected value 'deadline'
> 20160909:16:34:48:339941 gpcheck:hdw1-[ERROR]:-host(hdw1): on device (sde)
> IO scheduler 'cfq' does not match expected value 'deadline'
>
> I did a bit of research, and generally I see hadoop hardware guides
> recommend cfq as the I/O scheduler, rather than deadline.
>
> http://amd-dev.wpengine.netdna-cdn.com/wordpress/
> media/2012/10/Hadoop_Tuning_Guide-Version5.pdf
> - Page 18
>
> http://www.datanubes.com/mediac/HadoopTuningDHT.pdf - Page 9
>
> Have we done any actual benchmarking for HAWQ I/O schedulers? Did we
> account for different use cases? Is deadline actually recommended for
> systems that run HAWQ, or is this recommendation just a holdover from the
> port from Greenplum?
>
> Thanks,
>
> Taylor Vesely
>
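
For anyone who wants to see which scheduler each device is actually using before running a benchmark, here is a rough sketch. It relies on the Linux sysfs file /sys/block/<device>/queue/scheduler, which lists the available schedulers with the active one in square brackets (for example "noop deadline [cfq]"). The function names are made up for illustration, and this is not how gpcheck itself does the comparison; the device names in the usage note are simply the ones from the error above.

```python
import re
from pathlib import Path


def parse_active_scheduler(text):
    """Return the bracketed scheduler name, or None if none is bracketed.

    Example input: "noop deadline [cfq]".
    """
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def active_scheduler(device, sysfs_root="/sys/block"):
    """Read the active I/O scheduler for one block device from sysfs."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    return parse_active_scheduler(path.read_text())


def mismatched_devices(devices, expected="deadline", sysfs_root="/sys/block"):
    """Map each device whose active scheduler differs from `expected`
    to the scheduler it is actually using."""
    result = {}
    for device in devices:
        current = active_scheduler(device, sysfs_root)
        if current != expected:
            result[device] = current
    return result
```

Calling mismatched_devices(["sdd", "sde"]) on the host from the error message would presumably report both devices as using cfq, which gives a starting point for running the same workload under each scheduler.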
