Aleksei Ieshin created HDDS-15586:
-------------------------------------

             Summary: Add freon command to read a user-supplied list of 
existing keys  
                 Key: HDDS-15586
                 URL: https://issues.apache.org/jira/browse/HDDS-15586
             Project: Apache Ozone
          Issue Type: Improvement
          Components: freon
            Reporter: Aleksei Ieshin
            Assignee: Aleksei Ieshin


  h2. Problem

  freon's client read generators can only read keys they generated themselves:

  * {{ockg}}/{{ockv}} use prefix+index naming, and {{ockv}} validates every read against key-0's digest (it assumes all keys have identical content).
  * {{SameKeyReader}} ({{ocokr}}) reads one fixed key from many threads.

  There is no freon command that points at an arbitrary, heterogeneous set of existing keys (a real dataset already in a bucket) and measures read throughput. This is needed for read-path performance and capacity/scaling work, where freshly generated uniform keys are page-cache-hot and not representative of production data.
                                       
                                                                                
                                                                                
                                     
  h2. Proposed change

  Add a freon subcommand {{OzoneClientKeyListReader}} ({{ocklr}}) that:

  * takes {{--key-file <path>}}: a local file with one key name per line; blank lines and {{#}} comments are ignored;
  * reuses {{BaseFreonGenerator}}: a warm shared {{OzoneClient}}, {{-t}} threads, {{-n}} total reads (task i reads keys[i % keys.size()], so an {{-n}} larger than the list loops over it), and a DropWizard timer;
  * per read, calls {{bucket.readKey(key)}}, drains the stream into a fixed buffer and counts bytes (no content/digest assumptions); reports the {{key-read}} timer plus an aggregate MB/s line (total bytes / wall time).
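
  The three steps above (parse the key file, map task index to key, drain and count bytes) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code from the patch: the class and method names are hypothetical, a comment is assumed to be a line whose first non-blank character is {{#}}, 1 MB is assumed to mean 1024 * 1024 bytes, and an in-memory stream stands in for {{bucket.readKey(key)}} so the sketch runs without an Ozone cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; names are hypothetical, not taken from the patch. */
class KeyListReadSketch {

  /** Keeps trimmed, non-blank lines whose first character is not '#'. */
  static List<String> parseKeyLines(List<String> lines) {
    List<String> keys = new ArrayList<>();
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue; // blank line or comment (assumed: whole-line comments only)
      }
      keys.add(trimmed);
    }
    return keys;
  }

  /** Task i reads keys[i % keys.size()], so a large -n wraps around the list. */
  static String keyForTask(List<String> keys, long taskIndex) {
    return keys.get((int) (taskIndex % keys.size()));
  }

  /** Reads to EOF into a reusable, non-empty buffer and returns the byte count. */
  static long drain(InputStream in, byte[] buffer) throws IOException {
    long total = 0;
    int n;
    while ((n = in.read(buffer)) != -1) {
      total += n;
    }
    return total;
  }

  /** Aggregate throughput; 1 MB = 1024 * 1024 bytes is an assumption. */
  static double megabytesPerSecond(long bytes, long elapsedNanos) {
    return (bytes / (1024.0 * 1024.0)) / (elapsedNanos / 1_000_000_000.0);
  }

  public static void main(String[] args) throws IOException {
    List<String> keys = parseKeyLines(
        Arrays.asList("# dataset A", "dir/a.bin", "", "  dir/b.bin  "));
    byte[] buffer = new byte[4096];
    long totalBytes = 0;
    long start = System.nanoTime();
    for (long i = 0; i < 5; i++) {
      String key = keyForTask(keys, i);
      // Stand-in for bucket.readKey(key): fake content derived from the key name.
      byte[] fake = key.getBytes(StandardCharsets.UTF_8);
      try (InputStream in = new ByteArrayInputStream(fake)) {
        totalBytes += drain(in, buffer);
      }
    }
    long elapsed = System.nanoTime() - start;
    System.out.printf("read %d bytes, %.2f MB/s%n",
        totalBytes, megabytesPerSecond(totalBytes, elapsed));
  }
}
```

  Only whole-line comments are skipped in this sketch because a key name may itself contain {{#}}; whether the real parser behaves the same way is not stated in this issue.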
                                                                                
                                     
                                                                                
                                                                                
                                     
  It exercises the same end-to-end read path as {{ozone sh key get}} and the FileSystem {{open()}} ({{readKey}} -> {{KeyInputStream}} -> {{BlockInputStream}} -> {{ChunkInputStream}} -> datanode {{ReadChunk}}), so results reflect the real client read stack. It also separates client warmth (JIT + pooled datanode connections) from datanode page-cache effects, and {{-t}} drives concurrency to find where read throughput saturates.
                                                                                
                                     
                                                                                
                                                                                
                                     
  h2. Example

  {code}
  ozone freon ocklr -v <volume> -b <bucket> --key-file /tmp/keys.txt -t 8 -n 160
  {code}
                                                                                
                                     
                                                                                
                                                                                
                                     
  h2. Implementation notes

  * ~110 LOC in hadoop-ozone/tools, mirrors {{OzoneClientKeyValidator}}; registered via {{@MetaInfServices(FreonSubcommand.class)}}. No new dependencies. Unit test for key-file parsing included.
  * Possible refinements from the discussion: per-key MB/s (mean ± stddev), a {{--buffer-size}} option, a thread-local read buffer, and/or routing the throughput summary through freon's standard report instead of a log line.
  * Naming ({{ocklr}}) follows the {{ockv}}/{{ockg}}/{{ocokr}} pattern; open to alternatives.
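
  Two of the refinements listed above, per-key MB/s (mean ± stddev) and a thread-local read buffer, could look roughly like the following. This is a hedged sketch, not part of the proposed patch: the class name, the 64 KB buffer size, the choice of sample (n - 1) standard deviation, and the use of Welford's online update are all assumptions made for illustration.

```java
/** Illustrative sketch only; names and defaults are hypothetical. */
class PerKeyThroughputSketch {

  // Placeholder size; a real --buffer-size option would supply this value.
  static final int BUFFER_SIZE = 64 * 1024;

  // One buffer per worker thread, allocated lazily on first use.
  static final ThreadLocal<byte[]> READ_BUFFER =
      ThreadLocal.withInitial(() -> new byte[BUFFER_SIZE]);

  private long count;
  private double mean;
  private double m2; // running sum of squared deviations from the mean

  /** Records one read's throughput in MB/s using Welford's online update. */
  synchronized void record(long bytes, long elapsedNanos) {
    double mbps = (bytes / (1024.0 * 1024.0)) / (elapsedNanos / 1e9);
    count++;
    double delta = mbps - mean;
    mean += delta / count;
    m2 += delta * (mbps - mean);
  }

  synchronized double mean() {
    return mean;
  }

  /** Sample standard deviation; 0 when fewer than two reads were recorded. */
  synchronized double stddev() {
    return count < 2 ? 0.0 : Math.sqrt(m2 / (count - 1));
  }

  public static void main(String[] args) {
    PerKeyThroughputSketch stats = new PerKeyThroughputSketch();
    stats.record(1L << 20, 1_000_000_000L);     // 1 MB in 1 s
    stats.record(3L * (1L << 20), 1_000_000_000L); // 3 MB in 1 s
    System.out.printf("per-key MB/s: %.2f ± %.2f (buffer %d bytes)%n",
        stats.mean(), stats.stddev(), READ_BUFFER.get().length);
  }
}
```

  An online update avoids storing every per-key sample, which matters when {{-n}} is large; the synchronized methods keep the sketch simple at the cost of some contention between reader threads.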
                                     
                                                                                
                                                                                
                                     
  Discussed and supported on the community forum: 
https://github.com/apache/ozone/discussions/10460



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
