[ https://issues.apache.org/jira/browse/HBASE-29658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HBASE-29658:
-----------------------------------
Labels: pull-request-available (was: )
> RVV Vectorization Optimization: Performance Enhancement for LZ4 Compression,
> BloomFilter Operations, and Scan Queries
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-29658
> URL: https://issues.apache.org/jira/browse/HBASE-29658
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.5.12
> Environment: - HBase Version: 2.5.12
> - Java Version: java-17-openjdk-17.0.11.9-1.eos30.riscv64
> - Operating System: Linux
> - Architecture: RISC-V (with RVV extension support)
> Reporter: jys
> Assignee: jys
> Priority: Major
> Labels: pull-request-available
> Attachments: hbase-rvv-optimization-code.zip
>
> Original Estimate: 216h
> Remaining Estimate: 216h
>
> 1. Overview
> This enhancement adds RISC-V Vector (RVV) optimizations to Apache HBase,
> intended to improve performance for LZ4 compression, BloomFilter
> operations, and scan queries.
> 2. Background
> As the RISC-V architecture gains traction in data centers and
> high-performance computing environments, native RVV-accelerated code paths
> will help establish HBase as a leading database solution for RISC-V ecosystems.
> 3. Implementation Details
> LZ4 Compression Optimization
> - Vectorized hash computation using RVV instructions
> - Parallel dictionary access and batch memory operations
> - JNI integration with runtime RVV support detection
> - Dynamic fallback to the standard implementation when RVV is unavailable
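
The last two LZ4 bullets describe a detect-then-dispatch pattern. A minimal sketch of that pattern in C follows, under several assumptions the issue text does not confirm: all function names are hypothetical, the hash is the multiplicative kind LZ4 uses with an assumed 12-bit table, and RVV is detected through the Linux `AT_HWCAP` auxiliary vector. The intrinsics use the `__riscv_`-prefixed names from the v1.0 RVV C intrinsics; older toolchains spell them without the prefix. The attached patch may do any of this differently.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__riscv)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#endif

#define HASH_LOG 12u    /* assumption: hash table with 2^12 entries */

/* Portable path: multiplicative hash of each 4-byte sequence. */
static void hash_batch_scalar(const uint32_t *seq, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = (seq[i] * 2654435761u) >> (32u - HASH_LOG);
    }
}

#if defined(__riscv) && defined(__riscv_vector)
#include <riscv_vector.h>

/* RVV path: same arithmetic, one vector register of sequences per iteration. */
static void hash_batch_rvv(const uint32_t *seq, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; ) {
        size_t vl = __riscv_vsetvl_e32m1(n - i);          /* elements this pass */
        vuint32m1_t v = __riscv_vle32_v_u32m1(seq + i, vl);
        v = __riscv_vmul_vx_u32m1(v, 2654435761u, vl);    /* low 32 bits of product */
        v = __riscv_vsrl_vx_u32m1(v, 32u - HASH_LOG, vl);
        __riscv_vse32_v_u32m1(out + i, v, vl);
        i += vl;
    }
}
#endif

/* Returns 1 if the running CPU advertises the V extension, else 0. */
static int cpu_has_rvv(void) {
#if defined(__linux__) && defined(__riscv)
    /* On RISC-V Linux, AT_HWCAP has one bit per single-letter ISA extension. */
    return (getauxval(AT_HWCAP) & (1UL << ('V' - 'A'))) != 0;
#else
    return 0;
#endif
}

typedef void (*hash_batch_fn)(const uint32_t *seq, uint32_t *out, size_t n);

/* Picks the RVV version only if it was compiled in AND the CPU supports it. */
static hash_batch_fn select_hash_batch(void) {
    if (cpu_has_rvv()) {
#if defined(__riscv) && defined(__riscv_vector)
        return hash_batch_rvv;
#endif
    }
    return hash_batch_scalar;
}
```

A caller would resolve the pointer once, for example `hash_batch_fn hash = select_hash_batch();`, and then call `hash(seq, out, n)` in the hot loop. On a build or machine without RVV the selector returns the scalar version, so results stay identical across platforms.
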
> BloomFilter Optimization
> - Vectorized bit manipulation and parallel hash computation
> - Batch processing for multiple keys in single vector operations
> - Optimized memory access patterns for vector operations
> - Enhanced performance for bulk BloomFilter operations
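
To make the batch-processing bullet concrete, here is a scalar reference for a batched Bloom filter lookup in C. It is only a sketch under stated assumptions: the `bloom_t` layout and function names are hypothetical, FNV-1a is a stand-in hash chosen for brevity, and probe positions use double hashing (`h1 + i * h2`). None of this is taken from the attached patch. The inner loop is deliberately branch-free, which is the shape a vectorized version would likely need, and a scalar baseline like this is what an RVV implementation would have to match bit for bit.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *bits;      /* bit array of nbits / 8 bytes */
    uint32_t nbits;     /* number of bits, must be > 0 */
    uint32_t nhashes;   /* probes per key */
} bloom_t;

/* FNV-1a: a stand-in hash, not the one HBase uses. */
static uint32_t fnv1a(const uint8_t *key, size_t len, uint32_t seed) {
    uint32_t h = 2166136261u ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= 16777619u;
    }
    return h;
}

/* Bit position of probe i, derived from two base hashes (double hashing). */
static uint32_t probe_pos(const bloom_t *bf, uint32_t h1, uint32_t h2, uint32_t i) {
    return (h1 + i * h2) % bf->nbits;
}

/* Sets every probe bit for one key. */
static void bloom_add(bloom_t *bf, const uint8_t *key, size_t len) {
    uint32_t h1 = fnv1a(key, len, 0u);
    uint32_t h2 = fnv1a(key, len, h1);
    for (uint32_t i = 0; i < bf->nhashes; i++) {
        uint32_t p = probe_pos(bf, h1, h2, i);
        bf->bits[p >> 3] |= (uint8_t)(1u << (p & 7u));
    }
}

/* Batch lookup: hit[k] is 1 if keys[k] may be present, 0 if definitely absent. */
static void bloom_contains_batch(const bloom_t *bf, const uint8_t *const *keys,
                                 const size_t *lens, size_t n, uint8_t *hit) {
    for (size_t k = 0; k < n; k++) {
        uint32_t h1 = fnv1a(keys[k], lens[k], 0u);
        uint32_t h2 = fnv1a(keys[k], lens[k], h1);
        uint8_t all_set = 1;
        /* No early exit: every probe is evaluated and AND-ed together. */
        for (uint32_t i = 0; i < bf->nhashes; i++) {
            uint32_t p = probe_pos(bf, h1, h2, i);
            all_set &= (uint8_t)((bf->bits[p >> 3] >> (p & 7u)) & 1u);
        }
        hit[k] = all_set;
    }
}
```

Keys that were added always report 1, because every bit they set is still set; a 1 for a key that was never added is a false positive, which is inherent to Bloom filters.
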
> Scan Query Optimization
> - Vectorized byte comparisons and prefix matching
> - Batch data processing and memory copy optimization
> - Enhanced StoreScanner performance with RVV operations
> - Improved CellComparator with vectorized comparisons
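
The byte-comparison and prefix-matching bullets can be illustrated with a single primitive: find the first position where two byte ranges differ. The C sketch below is an assumption about how such a routine might look, not code from the attached patch, and its function names are hypothetical. The RVV branch is compiled only when `defined(__riscv) && defined(__riscv_vector)` holds and uses `__riscv_`-prefixed v1.0 intrinsics; everywhere else the scalar loop is used.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv) && defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Index of the first differing byte in a[0..n) and b[0..n), or n if none. */
static size_t first_mismatch(const uint8_t *a, const uint8_t *b, size_t n) {
#if defined(__riscv) && defined(__riscv_vector)
    size_t i = 0;
    while (i < n) {
        size_t vl = __riscv_vsetvl_e8m1(n - i);            /* bytes this pass */
        vuint8m1_t va = __riscv_vle8_v_u8m1(a + i, vl);
        vuint8m1_t vb = __riscv_vle8_v_u8m1(b + i, vl);
        vbool8_t ne = __riscv_vmsne_vv_u8m1_b8(va, vb, vl); /* mask of a != b */
        long idx = __riscv_vfirst_m_b8(ne, vl);             /* -1 if mask empty */
        if (idx >= 0) {
            return i + (size_t)idx;
        }
        i += vl;
    }
    return n;
#else
    size_t i = 0;
    while (i < n && a[i] == b[i]) {
        i++;
    }
    return i;
#endif
}

/* Unsigned lexicographic comparison: negative, zero, or positive. */
static int compare_bytes(const uint8_t *a, size_t alen,
                         const uint8_t *b, size_t blen) {
    size_t min = alen < blen ? alen : blen;
    size_t i = first_mismatch(a, b, min);
    if (i < min) {
        return (int)a[i] - (int)b[i];
    }
    /* Common prefix is equal: the shorter range sorts first. */
    return (alen > blen) - (alen < blen);
}

/* Returns 1 if row starts with prefix, else 0. */
static int has_prefix(const uint8_t *row, size_t rowlen,
                      const uint8_t *prefix, size_t plen) {
    return plen <= rowlen && first_mismatch(row, prefix, plen) == plen;
}
```

Both `compare_bytes` and `has_prefix` go through the same primitive, so only one routine needs a vectorized variant and a scalar twin to validate against.
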
> 4. Technical Implementation
> - Conditional compilation using `#if defined(__riscv) &&
> defined(__riscv_vector)`
> - Runtime detection with graceful fallback when RVV is unavailable
> - Backward compatible: no impact on existing deployments
> - Built-in performance monitoring and metrics collection
> - JNI integration for native RVV operations
> 5. Testing and Validation
> - Comprehensive unit test suite
> - YCSB benchmark integration with detailed metrics
> - Correctness validation ensuring identical results
> - Cross-platform compatibility verification
> - Performance analysis and optimization validation
> 6. Files Modified/Created
> - Modified: Lz4Compressor.java, BloomFilterChunk.java,
> CompoundBloomFilterWriter.java, StoreScanner.java, CellComparatorImpl.java
> - Created: Lz4Native.java, NativeLoader.java, BloomFilterRvvNative.java,
> ScanRVV.java, RVVByteBufferUtils.java
> - Native implementations: lz4.c, Lz4Native.c, bloomfilter_rvv.c, scan_rvv.c,
> scan_rvv_jni.c
> 7. Ready for Community Review
> The implementation is complete and ready for Apache HBase community review
> and integration.
> I welcome feedback, suggestions, and guidance from the Apache HBase community
> to ensure this contribution meets the highest standards and aligns with the
> project's goals.
> 8. Community Benefits
> - Better developer experience on RISC-V platforms
> - Improved performance for data-intensive workloads
> - Enhanced HBase competitiveness in RISC-V ecosystems
> - Valuable contribution to the open-source community
--
This message was sent by Atlassian Jira
(v8.20.10#820010)