Hattonuri opened a new issue, #38881:
URL: https://github.com/apache/arrow/issues/38881

   ### Describe the enhancement requested
   
   I am reading an arrow::ListArray of Decimal128 values, stored as 
FixedLengthBinaryArray[16] in Parquet files, and the flamegraph shows that 
RawBytesToDecimalBytes consumes a large amount of time, mostly due to page 
faults. 
   <img width="1186" alt="image" 
src="https://github.com/apache/arrow/assets/53221537/c9624552-04d7-43bd-be1b-faacbf1664d4">
   
   I assume the problem is here:
   
https://github.com/apache/arrow/blob/eb5de184a7e5d02f98526332ace54250417bd232/cpp/src/parquet/arrow/reader_internal.cc#L559-L576
   where Arrow allocates a new buffer for the decimals and then converts the 
FLBA values into it.
   Since each FLBA[16] element occupies the same number of bytes as a 
Decimal128 value, I think this function could reuse the given array's data.
   
   A possible easy (but very dirty) solution would be to static_cast the given 
array to a Decimal array (it has no additional fields, so this would not be an 
error), const_cast its raw_values, and reverse the byte order in place.
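   
   To make the in-place idea concrete, here is a minimal sketch of the byte 
reversal alone. It is not Arrow code: ReverseDecimal128BytesInPlace and 
kDecimal128Width are hypothetical names, and the sketch assumes a 
little-endian host, values that are exactly 16 bytes wide (narrower FLBA 
values would also need sign extension and could not be converted in place), 
and a buffer that is not shared with any other reader.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical constant: byte width of one Decimal128 / FLBA[16] value.
constexpr std::size_t kDecimal128Width = 16;

// Hypothetical helper (not part of Arrow): reverses the bytes of each
// 16-byte value in place, turning big-endian FLBA[16] payloads into
// little-endian 128-bit integers without allocating a second buffer.
inline void ReverseDecimal128BytesInPlace(std::uint8_t* data,
                                          std::size_t num_values) {
  assert(data != nullptr || num_values == 0);
  for (std::size_t i = 0; i < num_values; ++i) {
    std::uint8_t* value = data + i * kDecimal128Width;
    std::reverse(value, value + kDecimal128Width);
  }
}

// Convenience overload for a contiguous byte vector whose size is a
// multiple of the value width.
inline void ReverseDecimal128BytesInPlace(std::vector<std::uint8_t>& buffer) {
  assert(buffer.size() % kDecimal128Width == 0);
  ReverseDecimal128BytesInPlace(buffer.data(),
                                buffer.size() / kDecimal128Width);
}
```

   Because the loop only touches memory that is already mapped, it avoids 
the page faults of writing into a freshly allocated buffer. The dirty part 
remains the const_cast: Arrow buffers are normally treated as immutable, so 
this is only safe if nothing else holds a reference to the same data.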
   
   
   ### Component(s)
   
   C++, Parquet


