[
https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
shengjun.li updated ARROW-5924:
-------------------------------
Description:
cmake_modules/DefineOptions.cmake
define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)
The correct sequence is as follows:
(1) plasma_client.Create(object_id, size, nullptr, 0, &buff, 1); // where device_num > 0
(2) plasma_client.Seal(object_id);
(3) buff = nullptr;
(4) plasma_client.Release(object_id);
(5) plasma_client.Delete(object_id);
buff must be set to nullptr (step 3) just before the object is released
(step 4), because CloseIpcBuffer is called in the destructor of class
CudaBuffer.
If the user does not do that promptly, CloseIpcBuffer is delayed until buff
is finally destroyed.
Then the following error may occur when another object is created:
IOError: Cuda Driver API call in
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with
code 208: cuIpcOpenMemHandle(&data, *handle,
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
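The ordering constraint in steps (3) and (4) can be illustrated with a small, self-contained sketch. It does not use Plasma or CUDA: FakeIpcBuffer, FakeClient, EventLog and RunSequence are hypothetical stand-ins that only record the order of calls, assuming (as described above) that the IPC mapping is closed in the buffer's destructor.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT the Plasma or Arrow CUDA classes.
using EventLog = std::vector<std::string>;

// Mimics a buffer that closes its IPC mapping in the destructor,
// as the description says CudaBuffer does via CloseIpcBuffer.
class FakeIpcBuffer {
 public:
  explicit FakeIpcBuffer(EventLog* log) : log_(log) {
    assert(log_ != nullptr);
    log_->push_back("open_ipc");
  }
  ~FakeIpcBuffer() { log_->push_back("close_ipc"); }

 private:
  EventLog* log_;
};

// Mimics only the call order of a client; it performs no real I/O.
class FakeClient {
 public:
  explicit FakeClient(EventLog* log) : log_(log) {}
  void Create(std::shared_ptr<FakeIpcBuffer>* out) {
    *out = std::make_shared<FakeIpcBuffer>(log_);
  }
  void Seal() { log_->push_back("seal"); }
  void Release() { log_->push_back("release"); }
  void Delete() { log_->push_back("delete"); }

 private:
  EventLog* log_;
};

// Runs the five-step sequence and returns the recorded order of events.
EventLog RunSequence(bool reset_before_release) {
  EventLog log;
  {
    FakeClient client(&log);
    std::shared_ptr<FakeIpcBuffer> buff;
    client.Create(&buff);                      // step 1
    client.Seal();                             // step 2
    if (reset_before_release) buff = nullptr;  // step 3
    client.Release();                          // step 4
    client.Delete();                           // step 5
  }  // if step 3 was skipped, the buffer is only destroyed here
  return log;
}
```

With reset_before_release set to true, "close_ipc" is recorded before "release"; with false, it is recorded only after "delete", when buff goes out of scope. That late close is the window in which the error above can appear.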
was:
cmake_modules/DefineOptions.cmake
define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)
The correct sequence is as follows:
(1) plasma_client.Create(object_id, size, nullptr, 0, &buff, 1); // where device_num > 0
(2) plasma_client.Seal(object_id);
(3) buff = nullptr;
(4) plasma_client.Release(object_id);
(5) plasma_client.Delete(object_id);
buff must be set to nullptr (step 3) just before the object is released
(step 4), because CloseIpcBuffer is called in the destructor of class
CudaBuffer.
If the user does not do that promptly, CloseIpcBuffer is delayed until buff
is finally destroyed.
Then the following error may occur when another object is created:
IOError: Cuda Driver API call in
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with
code 208: cuIpcOpenMemHandle(&data, *handle,
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
Here is a sample.
thread 1:
{
  std::shared_ptr<Buffer> buff;
  plasma_client1.Create(object_id1, size, nullptr, 0, &buff, 1);
  plasma_client1.Seal(object_id1);
  // buff is not set to nullptr here
  plasma_client1.Release(object_id1);
  plasma_client1.Delete(object_id1);
  // ... do something else, or nothing at all
}
// buff is released automatically at the closing brace above.
thread 2:
{
  std::shared_ptr<Buffer> buff;
  plasma_client2.Create(object_id2, size, nullptr, 0, &buff, 1);
  // If the address allocated by the server is the one just released for
  // object_id1, the error occurs!
}
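One plausible reading of the sample is that the first buffer's IPC mapping is still open in the process when the server hands out the same device address again (code 208 is CUDA_ERROR_ALREADY_MAPPED in the CUDA driver API). The sketch below models only that reading and is not CUDA or Plasma code: FakeProcessMappings, FakeMappedBuffer, TryCreate and SecondCreateSucceeds are hypothetical names, the address 0x1000 is arbitrary, and both "threads" are collapsed into one function for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <set>

// Hypothetical model of per-process IPC mappings; NOT the CUDA driver.
class FakeProcessMappings {
 public:
  // Returns false if the address is already mapped in this process.
  bool Open(std::uintptr_t addr) { return open_.insert(addr).second; }
  void Close(std::uintptr_t addr) { open_.erase(addr); }

 private:
  std::set<std::uintptr_t> open_;
};

// Mimics a buffer that closes its mapping only in the destructor.
class FakeMappedBuffer {
 public:
  FakeMappedBuffer(FakeProcessMappings* mappings, std::uintptr_t addr)
      : mappings_(mappings), addr_(addr) {}
  ~FakeMappedBuffer() { mappings_->Close(addr_); }

 private:
  FakeProcessMappings* mappings_;
  std::uintptr_t addr_;
};

// Maps addr and wraps it in a buffer; returns nullptr on "already mapped".
std::shared_ptr<FakeMappedBuffer> TryCreate(FakeProcessMappings* mappings,
                                            std::uintptr_t addr) {
  if (!mappings->Open(addr)) return nullptr;
  return std::make_shared<FakeMappedBuffer>(mappings, addr);
}

// Models the sample: the server reuses one address for two objects.
bool SecondCreateSucceeds(bool drop_first_buffer) {
  FakeProcessMappings mappings;
  const std::uintptr_t reused_addr = 0x1000;  // arbitrary illustrative value

  std::shared_ptr<FakeMappedBuffer> first = TryCreate(&mappings, reused_addr);
  assert(first != nullptr);
  if (drop_first_buffer) first = nullptr;  // the "buff = nullptr" step

  // The object was released and deleted on the server, which now hands
  // out the same address for the next object.
  std::shared_ptr<FakeMappedBuffer> second = TryCreate(&mappings, reused_addr);
  return second != nullptr;
}
```

In this model the second create succeeds only when the first buffer has been dropped beforehand; keeping the stale shared_ptr alive makes the second open fail, which mirrors the failure reported for thread 2.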
> [C++][Plasma] It is not convenient to release a GPU object
> ----------------------------------------------------------
>
> Key: ARROW-5924
> URL: https://issues.apache.org/jira/browse/ARROW-5924
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++ - Plasma
> Affects Versions: 0.14.0
> Reporter: shengjun.li
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.14.1
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h