wumeibanfa commented on code in PR #52967:
URL: https://github.com/apache/doris/pull/52967#discussion_r2196830098
##########
be/src/util/frame_of_reference_coding.cpp:
##########
@@ -421,28 +421,147 @@ bool ForDecoder<T>::init() {
}
// todo(kks): improve this method by SIMD instructions
+
+template <typename T>
+template <typename U>
+void ForDecoder<T>::bit_unpack_optimize(const uint8_t* input, uint8_t in_num, int bit_width,
+ T* output) {
+ U s = 0;
+ int valid_bit = 0; // How many valid bits
+ int need_bit = 0; // still need
+ T output_mask = ((static_cast<T>(1)) << bit_width) - 1;
+ int u_size = sizeof(U); // Size of U
+ size_t input_size = (in_num * bit_width + 7) >> 3; // input's size
+ int full_batch_size =
+ (input_size / u_size) * u_size; // Adjust input_size to a multiple of u_size
+ int tail_count = input_size & (u_size - 1); // The remainder of input_size modulo u_size.
+ // The number of bits in input to adjust to multiples of 8 and thus more
+ int more_bit = (input_size << 3) - (in_num * bit_width);
+
+ for (int i = 0; i < full_batch_size; i += u_size) {
+ s |= static_cast<U>(input[i]);
+ s <<= 8;
Review Comment:
If we read 8 bytes at once with reinterpret_cast<const int64_t*>(input + i),
the result depends on the platform's endianness (usually little-endian), so
the loaded value must be byte-swapped to get the right s:
```c++
int64_t raw = *reinterpret_cast<const int64_t*>(input + i);
U ss = __builtin_bswap64(raw);
```
For int128, however, we need to write an additional function to do the
endianness conversion, like:
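```c++
__int128 bswap128(__int128 x) {
    // Split into 64-bit halves and reverse the bytes of each half.
    uint64_t high = __builtin_bswap64(static_cast<uint64_t>(x >> 64));
    uint64_t low = __builtin_bswap64(static_cast<uint64_t>(x));
    // The swapped low half becomes the new high half. Shift as unsigned so
    // that no bit is shifted into the sign bit of a signed __int128.
    return static_cast<__int128>((static_cast<unsigned __int128>(low) << 64) | high);
}
```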
Can we do this? Would it produce different results due to endianness
differences?
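
To make the endianness question concrete, here is a minimal sketch, assuming
GCC/Clang (`__builtin_bswap64`, `__int128`, and the predefined
`__BYTE_ORDER__` macro). The helper names `load_be64_portable`,
`load_be64_fast`, `load_be128_fast` and `check_loads_agree` are hypothetical
and are not part of the PR. The byte-wise shift loop gives the same value on
any host, while the single load needs a byte swap only on little-endian
hosts, so the swap is guarded by the host byte order. `std::memcpy` is used
instead of `reinterpret_cast` to avoid unaligned and aliasing problems.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: assemble 8 bytes as a big-endian integer, one byte at
// a time. The result does not depend on the host byte order.
inline uint64_t load_be64_portable(const uint8_t* p) {
    uint64_t v = 0;
    for (int k = 0; k < 8; ++k) {
        v = (v << 8) | p[k];
    }
    return v;
}

// Hypothetical helper: same value, using one 8-byte load.
inline uint64_t load_be64_fast(const uint8_t* p) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    uint64_t raw;
    std::memcpy(&raw, p, sizeof(raw)); // safe for unaligned input
    return __builtin_bswap64(raw);     // little-endian host: reverse the bytes
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    uint64_t raw;
    std::memcpy(&raw, p, sizeof(raw));
    return raw;                        // big-endian host: already in order
#else
    return load_be64_portable(p);      // unknown byte order: stay portable
#endif
}

// Hypothetical helper: 16 bytes as a big-endian 128-bit value, built from two
// 64-bit loads, so no separate bswap128 is needed.
inline unsigned __int128 load_be128_fast(const uint8_t* p) {
    return (static_cast<unsigned __int128>(load_be64_fast(p)) << 64) |
           load_be64_fast(p + 8);
}

// Self-check: both loaders must agree on any 8-byte window.
inline void check_loads_agree(const uint8_t* p) {
    assert(load_be64_fast(p) == load_be64_portable(p));
}
```

With a guard like this the fast path and the byte-wise loop should produce
the same `s` regardless of host endianness; an unconditional
`__builtin_bswap64` would only be correct on little-endian hosts.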