Hi all,

Tajo leverages the feature introduced by HDFS-3672, which exposes the disk volume ID of each HDFS data block. I have already found the related code in DefaultTaskScheduler.assignToLeafTasks. Can anyone explain the logic to me? What does the scheduler do when the HDFS read is a remote read from a disk on another machine?

Thanks,
Min

--
My research interests are distributed systems, parallel computing, and bytecode-based virtual machines.
My profile: http://www.linkedin.com/in/coderplay
My blog: http://coderplay.javaeye.com
