[ https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
shengjun.li updated ARROW-5924:
-------------------------------
    Description: 
cmake_modules/DefineOptions.cmake
    define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
    define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)

The correct sequence is as follows:
(1) plasma_client.Create(object_id, size, nullptr, 0, &buff, 1); // where device_num > 0
(2) plasma_client.Seal(object_id);
(3) buff = nullptr;
(4) plasma_client.Release(object_id);
(5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before the object is released (step 4), because CloseIpcBuffer is called in its destructor (class CudaBuffer).
If the user does not do that promptly, CloseIpcBuffer will be blocked.
Then the following error may occur when another object is created:

IOError: Cuda Driver API call in /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with code 208: cuIpcOpenMemHandle(&data, *handle, CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.

thread 1:
{
    std::shared_ptr<Buffer> buff;
    plasma_client1.Create(object_id1, size, nullptr, 0, &buff, 1);
    plasma_client1.Seal(object_id1);
    // buff is not set to nullptr here
    plasma_client1.Release(object_id1);
    plasma_client1.Delete(object_id1);
    // ... do something else, or nothing at all
}
// buff is released automatically here.

thread 2:
{
    std::shared_ptr<Buffer> buff;
    plasma_client2.Create(object_id2, size, nullptr, 0, &buff, 1);
    // If the address allocated by the server is the one just released for object_id1, an error occurs!
}

> [C++][Plasma] It is not convenient to release a GPU object
> ----------------------------------------------------------
>
>                 Key: ARROW-5924
>                 URL: https://issues.apache.org/jira/browse/ARROW-5924
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++ - Plasma
>    Affects Versions: 0.14.0
>            Reporter: shengjun.li
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.1
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> cmake_modules/DefineOptions.cmake
>     define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
>     define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)
>
> The correct sequence is as follows:
> (1) plasma_client.Create(object_id, size, nullptr, 0, &buff, 1); // where device_num > 0
> (2) plasma_client.Seal(object_id);
> (3) buff = nullptr;
> (4) plasma_client.Release(object_id);
> (5) plasma_client.Delete(object_id);
>
> buff must be set to nullptr (step 3) just before the object is released (step 4), because
> CloseIpcBuffer is called in its destructor (class CudaBuffer).
> If the user does not do that promptly, CloseIpcBuffer will be blocked.
> Then the following error may occur when another object is created:
>
> IOError: Cuda Driver API call in
> /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with
> code 208: cuIpcOpenMemHandle(&data, *handle,
> CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
>
> Here is a sample.
>
> thread 1:
> {
>     std::shared_ptr<Buffer> buff;
>     plasma_client1.Create(object_id1, size, nullptr, 0, &buff, 1);
>     plasma_client1.Seal(object_id1);
>     // buff is not set to nullptr here
>     plasma_client1.Release(object_id1);
>     plasma_client1.Delete(object_id1);
>     // ... do something else, or nothing at all
> }
> // buff is released automatically here.
>
> thread 2:
> {
>     std::shared_ptr<Buffer> buff;
>     plasma_client2.Create(object_id2, size, nullptr, 0, &buff, 1);
>     // If the address allocated by the server is the one just released for object_id1,
>     // an error occurs!
> }

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
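
The ordering problem described in the issue can be illustrated without Plasma or CUDA. The sketch below is only a model under stated assumptions: MockIpcBuffer and MockClient are hypothetical stand-ins (not Arrow classes) whose sole job is to record when the buffer destructor, where the IPC handle would be closed, runs relative to Release. It does not reproduce the CUDA error itself.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU buffer: its destructor records the point
// at which the IPC handle would be closed.
struct MockIpcBuffer {
  explicit MockIpcBuffer(std::vector<std::string>* log) : log_(log) {}
  ~MockIpcBuffer() { log_->push_back("close_ipc"); }
  std::vector<std::string>* log_;
};

// Hypothetical stand-in for the client: Release only records that it ran.
struct MockClient {
  explicit MockClient(std::vector<std::string>* log) : log_(log) {}
  void Release() { log_->push_back("release"); }
  std::vector<std::string>* log_;
};

// Steps (3) then (4) from the issue: drop the buffer, then release the object.
std::vector<std::string> ResetBeforeRelease() {
  std::vector<std::string> log;
  MockClient client(&log);
  {
    auto buff = std::make_shared<MockIpcBuffer>(&log);
    buff = nullptr;    // last reference dropped: destructor runs here
    client.Release();
  }
  return log;
}

// The problematic order: the buffer outlives Release and is only destroyed
// when it goes out of scope.
std::vector<std::string> ReleaseBeforeScopeExit() {
  std::vector<std::string> log;
  MockClient client(&log);
  {
    auto buff = std::make_shared<MockIpcBuffer>(&log);
    client.Release();
  }  // destructor runs here, after Release
  return log;
}
```

Calling ResetBeforeRelease() yields the close before the release, while ReleaseBeforeScopeExit() yields the release first. The second order leaves a window in which the object has been released but the handle is still open, which is the situation the issue describes.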