Re: Tajo storage layer

Jihoon Son Sat, 01 Feb 2014 07:30:20 -0800

Hi, Min

StorageManagerV2 works as follows. The ScanScheduler coordinates read
requests across disks. When it receives a batch of read requests, it first
finds the DiskFileScanScheduler that currently has the fewest read requests
assigned, and then assigns one read request to it. This process is repeated
for the remaining read requests. Each DiskFileScanScheduler creates a
FileScanRunner for every request assigned to it. A FileScanRunner simply
reads data through a fixed-size buffer.
You can see the related issue at
https://issues.apache.org/jira/browse/TAJO-178, and this figure
<https://issues.apache.org/jira/secure/attachment/12602567/tajo_storage_manager.png>
will help you understand.
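
As a rough illustration of the assignment step described above, here is a
minimal Java sketch. It is not the actual Tajo code: LeastLoadedScanScheduler,
DiskQueue, submit and load are hypothetical names, read requests are modeled
as plain strings, and the threads and FileScanRunners are left out. It only
shows the "pick the scheduler with the fewest assigned requests" loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of least-loaded assignment; not Tajo's real classes.
class LeastLoadedScanScheduler {

    // Stand-in for a per-disk scheduler: it only records assigned requests.
    static final class DiskQueue {
        final int diskId;
        final List<String> assigned = new ArrayList<>();

        DiskQueue(int diskId) {
            this.diskId = diskId;
        }
    }

    private final List<DiskQueue> disks = new ArrayList<>();

    LeastLoadedScanScheduler(int numDisks) {
        if (numDisks <= 0) {
            throw new IllegalArgumentException("numDisks must be positive");
        }
        for (int i = 0; i < numDisks; i++) {
            disks.add(new DiskQueue(i));
        }
    }

    // For each request, find the disk queue with the fewest assigned
    // requests and hand the request to it. Ties go to the lowest disk id.
    void submit(List<String> requests) {
        for (String request : requests) {
            DiskQueue target = disks.get(0);
            for (DiskQueue candidate : disks) {
                if (candidate.assigned.size() < target.assigned.size()) {
                    target = candidate;
                }
            }
            target.assigned.add(request);
        }
    }

    // Number of requests currently assigned to the given disk.
    int load(int diskId) {
        return disks.get(diskId).assigned.size();
    }
}
```

With three disks and seven requests, this spreads the requests so that no
disk holds more than one request beyond any other.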


Although StorageManagerV2 was designed to improve read performance by
scheduling disk scans, its performance was not up to our expectations. As
you said, its thread model is too complex, and that may be what degrades
the performance. So StorageManager, which is the default, is mainly used
instead of StorageManagerV2.

Thanks,
Jihoon


2014-02-01 Min Zhou <[email protected]>:

> Hi all,
>
> It seems the thread model of the Tajo storage layer is quite complex.
> Each call of StorageManagerFactory.getStorageManager(TajoConf) creates
> one instance of StorageManagerV2, which creates a scan scheduler thread
> and several disk file scan scheduler threads. Why are those threads
> needed? What is their function? How do those threads work with file
> scanners?
>
>
> Regards,
> Min
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>
