Hi, I have run into a difficult problem while using Kudu 1.6. My table schema is roughly as follows:

column name: key, type: string, prefix encoding, LZ4 compression, primary key
column name: value, type: string, LZ4 compression
The primary key is built from three parts, for example 001320_201803220420_00000001. The first part is a unique ID, the second part is a time string, and the third part is an incrementing integer (a given unique ID and time can have multiple values, so I use this part to tell them apart).

The table is range partitioned on the first part, split like this:

range < 005000
005000 <= range < 010000
010000 <= range < 015000
015000 <= range < 020000
.....
.....
995000 <= range

When I scan the data for one unique ID over a range of time, with a lower bound like 001320_201803220420_00000001 and an upper bound like 001320_201803230420_99999999, each call to kuduScanner.nextRows() takes about 500 ms and returns only 20 to 50 rows. There are about 8000 rows in total between the bounds, so I have to call nextRows() hundreds of times to fetch all the data, and the whole scan ends up taking several minutes.

I don't know why this happens or how to resolve it. Maybe the last resort is to give up on Kudu and use HBase instead...
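To make the key layout and scan bounds described above concrete, here is a minimal sketch in plain Java with no Kudu dependency. The class name KeyBounds, its method names, and the assumption that IDs are six-digit numbers with splits every 5000 are only illustrative and are not taken from the actual application code.

```java
import java.util.Locale;

// Hypothetical helper: illustrates the composite key and the range splits only.
final class KeyBounds {
    // Assumed from the split list: one partition boundary every 5000 IDs.
    static final int SPLIT_WIDTH = 5000;

    // Builds a key like 001320_201803220420_00000001
    // (6-digit ID, time string as given, 8-digit sequence number).
    static String key(int id, String time, int seq) {
        return String.format(Locale.ROOT, "%06d_%s_%08d", id, time, seq);
    }

    // Index of the range partition a key falls into:
    // 0 for range < 005000, 1 for 005000 <= range < 010000, and so on.
    static int partitionIndex(String key) {
        int id = Integer.parseInt(key.substring(0, 6));
        return id / SPLIT_WIDTH;
    }

    public static void main(String[] args) {
        String lower = key(1320, "201803220420", 1);
        String upper = key(1320, "201803230420", 99_999_999);
        System.out.println("lower bound: " + lower);
        System.out.println("upper bound: " + upper);
        // Both bounds share the same ID prefix, so they map to the same partition.
        System.out.println("partition of lower: " + partitionIndex(lower));
        System.out.println("partition of upper: " + partitionIndex(upper));
    }
}
```

Because the ID and sequence parts are zero-padded to a fixed width, string comparison of the keys matches the intended ordering, and both bounds of a scan for one ID land in a single range partition.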
