xuzhiwen1255 opened a new issue, #5663:
URL: https://github.com/apache/iceberg/issues/5663

   ### Feature Request / Improvement
   
   With the current sorting method, when values share a common prefix (for 
example, URLs that all begin with https://www.), sorting on only the first few 
bits of each value is meaningless, because those bits are identical
   
   The range-based mapping works as follows:
   1. Sample the data, partition all the data into ranges according to the 
sample, and use each row's partition number as the Z value of that field
   2. Interleave the Z values generated by all ZOrder fields to produce the 
final Z value
   3. Sort all data by the final Z value
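
The three steps above can be sketched in plain Python. This is only an illustration of the idea, not Iceberg or Spark code: the function names (`range_boundaries`, `range_id`, `interleave`, `zorder_sort`), the quantile-based choice of boundaries, and the default of 4 ranges are all assumptions made for the example.

```python
import bisect
import random

def range_boundaries(values, num_ranges, sample_size=1000, seed=0):
    """Sample the column and pick num_ranges - 1 sorted split points (assumed strategy)."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(values, min(sample_size, len(values))))
    # Evenly spaced quantiles of the sample act as range boundaries.
    return [sample[(i * len(sample)) // num_ranges] for i in range(1, num_ranges)]

def range_id(value, boundaries):
    """Step 1: index of the range a value falls into, starting from 0."""
    return bisect.bisect_right(boundaries, value)

def interleave(ids, bits):
    """Step 2: interleave the low `bits` bits of each id, most significant bit first."""
    z = 0
    for bit in range(bits - 1, -1, -1):
        for i in ids:
            z = (z << 1) | ((i >> bit) & 1)
    return z

def zorder_sort(rows, num_ranges=4):
    """Step 3: sort rows (tuples of column values) by their final Z value."""
    bits = max(1, (num_ranges - 1).bit_length())
    columns = list(zip(*rows))
    bounds = [range_boundaries(list(col), num_ranges) for col in columns]

    def z_value(row):
        return interleave([range_id(v, b) for v, b in zip(row, bounds)], bits)

    return sorted(rows, key=z_value)
```

Because every field is first reduced to a small range index, the interleaved Z value needs only a few bits per field regardless of the original column type.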
   
   This approach addresses the two problems of method one:
   1. Partition numbers are integers starting from 0 (the fields used to 
generate Z-values should theoretically be non-negative integers starting from 0 
to produce a good Z-curve)
   2. Data with the same prefix can still be assigned to different partitions 
according to the ordering of the full string, and therefore obtain different Z 
values
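
A small, hypothetical comparison illustrates the second point. The URLs, the 8-character truncation length, and the way boundaries are chosen are invented for the example and are not taken from Iceberg.

```python
import bisect

PREFIX_LEN = 8  # hypothetical truncation length, for illustration only

urls = [
    "https://www.a.example/",
    "https://www.b.example/",
    "https://www.c.example/",
    "https://www.d.example/",
]

# Prefix-based key: every URL collapses to the same value.
prefix_keys = [u[:PREFIX_LEN] for u in urls]

# Range-based key: boundaries come from the data itself, so each URL
# falls into a different range even though the prefixes are identical.
boundaries = sorted(urls)[1:]
range_ids = [bisect.bisect_right(boundaries, u) for u in urls]

print(set(prefix_keys))
print(range_ids)
```

With the prefix key all four rows are indistinguishable to the sort, while the range ids are distinct integers starting from 0.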
   
   
   CC: @rdblue @jackye1995 @namrathamyske @aokolnychyi @kbendick 
@szehon-ho @nastra 
   
   ### Query engine
   
   Spark


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]