> KaiGai Kohei:
> >It seems to me you are a little bit optimistic.
> >Unlike CPU code, GPU-Sorting logic has to reference device memory space,
> >so all the data to be compared needs to be transferred to GPU devices.
> >Any pointer on host address space is not valid on GPU calculation.
> >Amount of device memory is usually smaller than host memory, so your code
> >needs a capability to combined multiple chunks that is partially sorted...
> >Probably, it is not all here.
>
> Aren't there algorithms which help you if the device memory is limited and the
> data is massive? I have a rough memory because I did a course online, where I
> saw algorithms to deal with such problems I suppose.
>

What I took is a hybrid approach to process data sets that exceed the device
memory limit.
First, it splits the input data stream into one or more chunks.
Second, it launches a bitonic-sorting kernel with a key-comparison function
generated on the fly.
Third, it launches a dynamic background worker that runs the merge-sort logic
on the CPU.

It does not try to handle all of the sorting on the GPU.
The point we should not forget is that the CPU/GPU is a means of sorting, not
the goal.
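
To make the three steps above concrete, here is a minimal Python sketch. It is not PG-Strom code: the names `bitonic_sort` and `hybrid_sort`, the `chunk_size` parameter, and the `key` callable are all hypothetical. The bitonic network runs on the CPU here purely as a stand-in for the GPU kernel, and `heapq.merge` stands in for the CPU-side merge worker.

```python
import heapq


def bitonic_sort(values, key=lambda v: v):
    """Sort one chunk with a bitonic sorting network.

    Stand-in for the GPU kernel: on a device, each compare-exchange in the
    inner loop would be done by an independent thread.
    """
    n = 1
    while n < len(values):
        n *= 2
    pad = object()  # sentinel that compares greater than every real item
    items = list(values) + [pad] * (n - len(values))

    def greater(a, b):
        if a is pad:
            return b is not pad
        if b is pad:
            return False
        return key(a) > key(b)

    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j > 0:       # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if ascending:
                        out_of_order = greater(items[i], items[partner])
                    else:
                        out_of_order = greater(items[partner], items[i])
                    if out_of_order:
                        items[i], items[partner] = items[partner], items[i]
            j //= 2
        k *= 2
    return items[:len(values)]  # padding has sorted to the tail


def hybrid_sort(rows, chunk_size, key=lambda v: v):
    """Split into chunks, sort each chunk, then merge the sorted chunks."""
    # Step 1: split the input into chunks that would fit in device memory.
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    # Step 2: sort each chunk independently (the GPU's job in the real design).
    sorted_chunks = [bitonic_sort(chunk, key) for chunk in chunks]
    # Step 3: merge the partially sorted chunks on the CPU.
    return list(heapq.merge(*sorted_chunks, key=key))
```

For example, `hybrid_sort(rows, chunk_size=3)` sorts every three-row chunk separately and leaves only the k-way merge to the final step, so no single sort call ever needs more than one chunk in memory.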

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>