Wan Kun created SPARK-44239:
-------------------------------

             Summary: Reclaim memory allocated by huge column vector
                 Key: SPARK-44239
                 URL: https://issues.apache.org/jira/browse/SPARK-44239
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: Wan Kun

When Spark reads data files into WritableColumnVectors, the memory allocated by those vectors is not freed until the VectorizedColumnReader has finished. Reusing the allocated array objects saves allocation time, but it also keeps a large amount of memory occupied and unused after a large vector batch has already been read.

Add a vector reserve policy for this scenario: keep reusing the allocated array object for small column vectors, and free the memory for huge column vectors.
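
The proposed policy could be sketched roughly as follows. This is only an illustration of the idea and not Spark's actual WritableColumnVector code: the class ReservingIntBuffer, its fields defaultCapacity and hugeThreshold, and the methods reserve, reset and capacity are all hypothetical names chosen for this example, and the threshold values are assumptions.

```java
import java.util.Arrays;

// Hypothetical sketch of a "reserve policy" for a reusable column buffer.
// It is NOT Spark's WritableColumnVector; all names here are assumptions.
final class ReservingIntBuffer {
    private final int defaultCapacity; // capacity to fall back to after a huge batch
    private final int hugeThreshold;   // a backing array longer than this counts as "huge"
    private int[] data;

    ReservingIntBuffer(int defaultCapacity, int hugeThreshold) {
        if (defaultCapacity <= 0 || hugeThreshold < defaultCapacity) {
            throw new IllegalArgumentException("invalid capacity settings");
        }
        this.defaultCapacity = defaultCapacity;
        this.hugeThreshold = hugeThreshold;
        this.data = new int[defaultCapacity];
    }

    // Grow the backing array so that it can hold at least `required` elements.
    void reserve(int required) {
        if (required > data.length) {
            long doubled = 2L * data.length;
            long target = Math.max((long) required, doubled);
            int newCapacity = (int) Math.min(target, Integer.MAX_VALUE - 8);
            data = Arrays.copyOf(data, newCapacity);
        }
    }

    // Called between batches. A small array is kept for reuse, which saves
    // allocation time. A huge array is dropped so the garbage collector can
    // reclaim it, and a default-sized array takes its place.
    void reset() {
        if (data.length > hugeThreshold) {
            data = new int[defaultCapacity];
        }
    }

    int capacity() {
        return data.length;
    }
}
```

With a default capacity of 4 and a threshold of 16, a batch that needs 10 elements leaves the 10-element array in place after reset, while a batch that needs 100 elements causes reset to shrink the buffer back to 4 elements. Where to set the threshold is a trade-off between allocation cost and retained memory, and this sketch does not try to answer it.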