It seems that the memory leak is caused by another part of my code (which
I thought was fine) and is not related to Arrow. I'll investigate further
and file an issue if needed.

On 10.11.2020 03:10, Wes McKinney wrote:
The memory should automatically be freed by any object / shared_ptr /
unique_ptr destruction. On Linux we use a background jemalloc thread
by default so it may not be freed immediately but it should not be
held indefinitely. In any case if you can reproduce the issue
consistently we'd be glad to take a look, please open a Jira issue and
provide as much information as you can to make it easy for us to
reproduce.
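
To illustrate the ownership rule described above, here is a minimal sketch
in plain standard C++. It does not use Arrow. TrackedBuffer, Holder,
g_live_bytes and LiveBytesAfterLocalUse are hypothetical names invented for
this example, with Holder standing in for a class that keeps a
std::shared_ptr member such as the mArrowTable mentioned in the quoted
message below. It only shows when a shared_ptr releases what it owns; it
says nothing about when an allocator such as jemalloc returns pages to the
operating system.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Bytes currently owned by live TrackedBuffer instances (illustration only).
std::size_t g_live_bytes = 0;

// Hypothetical stand-in for an object that owns memory; not an Arrow type.
struct TrackedBuffer {
    std::size_t size;
    explicit TrackedBuffer(std::size_t n) : size(n) { g_live_bytes += n; }
    ~TrackedBuffer() {
        assert(g_live_bytes >= size);
        g_live_bytes -= size;
    }
    TrackedBuffer(const TrackedBuffer&) = delete;
    TrackedBuffer& operator=(const TrackedBuffer&) = delete;
};

// Hypothetical class with a shared_ptr member, similar in shape to a class
// that stores std::shared_ptr<arrow::Table> mArrowTable.
class Holder {
public:
    // Assigning a new pointer releases the previously held buffer.
    void Load(std::size_t n) { mBuffer = std::make_shared<TrackedBuffer>(n); }
    // Dropping the last reference destroys the buffer immediately.
    void Clear() { mBuffer.reset(); }
private:
    std::shared_ptr<TrackedBuffer> mBuffer;
};

// A local shared_ptr releases its object when it goes out of scope.
std::size_t LiveBytesAfterLocalUse(std::size_t n) {
    {
        auto local = std::make_shared<TrackedBuffer>(n);
        (void)local;  // the buffer is alive only inside this block
    }
    return g_live_bytes;
}
```

In this sketch a member pointer keeps its object alive for as long as the
owning Holder exists, unless Load replaces it or Clear resets it, while a
local pointer releases its object at the end of the enclosing scope.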

On Mon, Nov 9, 2020 at 9:41 AM Maciej Skrzypkowski
<m.skrzypkow...@gmx.com> wrote:
OK, thanks for the answer.

mArrowTable is declared as "std::shared_ptr<arrow::Table> mArrowTable", so it
should be managed properly by the shared pointer. I've narrowed down the
problem to code like this:

void LoadCSVData::ReadArrowTableFromCSV( const std::string & filePath )
{
     auto tableReader = CreateTableReader( filePath );
     //ReadArrowTableUsingReader( *tableReader );
}

std::shared_ptr<arrow::csv::TableReader> LoadCSVData::CreateTableReader( const std::string & filePath )
{
     arrow::MemoryPool* pool = arrow::default_memory_pool();
     auto tableReader = arrow::csv::TableReader::Make( pool, OpenCSVFile( filePath ),
                                                       *PrepareReadOptions(),
                                                       *PrepareParseOptions(),
                                                       *PrepareConvertOptions() );
     if ( !tableReader.ok() )
     {
         throw BadParametersException( std::string( "CSV file reader error: " )
                                       + tableReader.status().ToString() );
     }
     return *tableReader;
}

Memory still keeps filling up when ReadArrowTableFromCSV is called many times.
Is Arrow's memory pool freed when the TableReader is destroyed, or should I
free it explicitly?


On 09.11.2020 15:01, Wes McKinney wrote:

We'd prefer to answer questions on the mailing list or Jira (if
something looks like a bug).

There isn't enough detail on the SO question to understand what other
things might be going on, but you are never destroying
this->mArrowTable which is holding on to allocated memory. If the
memory use keeps going up through repeated calls to the CSV reader
that sounds like a possible leak, so we would need to see more
details, including about your platform.

On Mon, Nov 9, 2020 at 2:33 AM Maciej Skrzypkowski
<m.skrzypkow...@gmx.com> wrote:

Hi All!

I don't understand memory management in the C++ Arrow API. I have some
memory leaks while using it. I've created a Stack Overflow question; maybe
someone can answer it:
https://stackoverflow.com/questions/64742588/how-to-manage-memory-while-reading-csv-using-apache-arrow-c-api

Thanks,
Maciej Skrzypkowski
