codope opened a new pull request, #8342:
URL: https://github.com/apache/hudi/pull/8342

   ### Change Logs
   
   Clustering on a bootstrap table (`METADATA_ONLY` bootstrap mode) with the row 
writer disabled produced incorrect results: only the meta fields were populated, 
while the data columns were null. This PR fixes the bug by adding a separate 
`HoodieBootstrapFileReader` that stitches the meta columns together with the 
data columns.
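
A minimal sketch of the stitching idea, not the actual `HoodieBootstrapFileReader` code: it assumes the meta columns and the data columns arrive as two row sequences that line up one to one by position. The function name `stitch_rows` and the dict-based rows are hypothetical; only the five meta field names come from the query output in this description.

```python
from typing import Dict, Iterable, Iterator

# The five Hudi meta fields, as they appear in the query output.
META_FIELDS = (
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
)


def stitch_rows(
    skeleton_rows: Iterable[Dict[str, object]],
    data_rows: Iterable[Dict[str, object]],
) -> Iterator[Dict[str, object]]:
    """Merge meta-only rows with data-only rows, pairing them by position.

    Assumption: row i of the skeleton file describes row i of the source file.
    """
    skeleton = list(skeleton_rows)
    data = list(data_rows)
    if len(skeleton) != len(data):
        raise ValueError("skeleton and data files must have the same row count")
    for meta_row, data_row in zip(skeleton, data):
        # Meta columns first, then the data columns, mirroring the table layout.
        stitched = {name: meta_row.get(name) for name in META_FIELDS}
        stitched.update(data_row)
        yield stitched


if __name__ == "__main__":
    skeleton = [{"_hoodie_commit_time": "00000000000001", "_hoodie_record_key": "trip_0"}]
    data = [{"rider": "rider_0", "driver": "driver_0"}]
    print(list(stitch_rows(skeleton, data)))
```

Reading only the skeleton side of such a pair would yield rows with meta fields set and data columns missing, which matches the symptom described above.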
   
   Snapshot query after clustering on a bootstrap table, before this fix:
   ```
   +-------------------+--------------------+------------------+----------------------+--------------------------------------------------------------------+-------------+--------+--------------+--------+---------+--------------------+-------------------+--------------------+---------------------+-------------------------+---------------------------+------------------+------------+
   |_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name                                                   |timestamp    |_row_key|partition_path|rider   |driver   |begin_lat           |begin_lon          |end_lat             |end_lon              |fare                     |tip_history                |_hoodie_is_deleted|datestr     |
   +-------------------+--------------------+------------------+----------------------+--------------------------------------------------------------------+-------------+--------+--------------+--------+---------+--------------------+-------------------+--------------------+---------------------+-------------------------+---------------------------+------------------+------------+
   |00000000000001     |00000000000001_5_0  |trip_0            |datestr=2018          |a80c61fe-89a1-488c-8daa-dac6cea52dfb_5-10-37_00000000000001.parquet |1680265327825|trip_0  |1680265327825 |rider_0 |driver_0 |0.2909073141582583  |0.6713659942455674 |0.3199873855402988  |0.8901008450132192   |[13.409427386679251, USD]|[[76.98430157746769, USD]] |false             |datestr=2018|
   |00000000000001     |00000000000001_7_0  |trip_3            |datestr=2018          |45b13f98-496e-4107-8c66-5ce88ab69940_7-10-39_00000000000001.parquet |1680265327825|trip_3  |1680265327825 |rider_3 |driver_3 |0.13139874521266626 |0.9288890012418678 |0.19960441648570804 |0.028970072867536834 |[3.934944937321838, USD] |[[60.94692580064911, USD]] |false             |datestr=2018|
   |00000000000001     |00000000000001_7_1  |trip_4            |datestr=2018          |45b13f98-496e-4107-8c66-5ce88ab69940_7-10-39_00000000000001.parquet |1680265327825|trip_4  |1680265327825 |rider_4 |driver_4 |0.19148119051373647 |0.3121563466437075 |0.07312220393022284 |0.4623498809657779   |[84.27465303833377, USD] |[[48.54971480008592, USD]] |false             |datestr=2018|
   |00000000000001     |00000000000001_6_0  |trip_2            |datestr=2018          |4b6f5614-cfe9-42cd-bd0c-09667714a6a3_6-10-38_00000000000001.parquet |1680265327825|trip_2  |1680265327825 |rider_2 |driver_2 |0.29293250471488286 |0.8169497077647824 |0.4575395537485407  |0.37034912499009554  |[65.48417669107184, USD] |[[51.323010501226705, USD]]|false             |datestr=2018|
   |00000000000001     |00000000000001_4_0  |trip_1            |datestr=2018          |2f656a57-d3d3-453b-bea3-beb3f86a2cfc_4-10-36_00000000000001.parquet |1680265327825|trip_1  |1680265327825 |rider_1 |driver_1 |0.7593035032651309  |0.4695942868315275 |0.04062310794619961 |0.7483312940246941   |[99.53761667379452, USD] |[[36.68130227843157, USD]] |false             |datestr=2018|
   |00000000000001     |00000000000001_8_0  |trip_6            |datestr=2019          |0e43fd89-9294-4630-8f7b-b782f15377b8_8-10-40_00000000000001.parquet |1680265327825|trip_6  |1680265327825 |rider_6 |driver_6 |0.6576893480206276  |0.20124822123740116|0.5587907101480606  |0.0087676912597352   |[46.3596114051868, USD]  |[[1.4482069738172454, USD]]|false             |datestr=2019|
   |00000000000001     |00000000000001_9_0  |trip_5            |datestr=2019          |1260bd0a-e1b0-469e-9407-c0952a2e5bce_9-10-41_00000000000001.parquet |1680265327825|trip_5  |1680265327825 |rider_5 |driver_5 |0.8780482394034513  |0.45016664520520033|0.1210946590521833  |0.559346262842122    |[3.980544730087332, USD] |[[11.81867856830614, USD]] |false             |datestr=2019|
   |00000000000001     |00000000000001_10_0 |trip_7            |datestr=2019          |ca2423ad-40f9-437a-a009-bf5b14cedb34_10-10-42_00000000000001.parquet|1680265327825|trip_7  |1680265327825 |rider_7 |driver_7 |0.8539282876074638  |0.6288419331027626 |0.1199959028048404  |0.19234888544292428  |[17.28229998461128, USD] |[[77.49172321783067, USD]] |false             |datestr=2019|
   |00000000000001     |00000000000001_11_0 |trip_8            |datestr=2019          |11502732-a705-4f63-9b8e-3ace93d8c9f4_11-10-43_00000000000001.parquet|1680265327825|trip_8  |1680265327825 |rider_8 |driver_8 |0.5247015895548016  |0.09543754441513863|0.1510348079622863  |0.3036501516600335   |[18.50748211199097, USD] |[[80.26618263126355, USD]] |false             |datestr=2019|
   |00000000000001     |00000000000001_11_1 |trip_9            |datestr=2019          |11502732-a705-4f63-9b8e-3ace93d8c9f4_11-10-43_00000000000001.parquet|1680265327825|trip_9  |1680265327825 |rider_9 |driver_9 |0.18732285899232892 |0.419057912375039  |0.9402509062992255  |0.7540875540699798   |[77.90400106882183, USD] |[[89.12865661547804, USD]] |false             |datestr=2019|
   |00000000000001     |00000000000001_3_0  |trip_10           |datestr=2020          |75960f03-0093-438c-b71c-fe5eb02496e4_3-10-35_00000000000001.parquet |1680265327825|trip_10 |1680265327825 |rider_10|driver_10|0.7945595842585961  |0.849250587072739  |0.8016352053998793  |0.6664019129654204   |[68.54476863463951, USD] |[[78.73973533402236, USD]] |false             |datestr=2020|
   |00000000000001     |00000000000001_0_0  |trip_12           |datestr=2020          |920c7f2e-0cc9-46b6-8780-3e3312ef133c_0-10-32_00000000000001.parquet |1680265327825|trip_12 |1680265327825 |rider_12|driver_12|0.26359097652813546 |0.3040963404277949 |0.783608220421833   |0.26773327561669813  |[8.899266098961778, USD] |[[63.19151746906088, USD]] |false             |datestr=2020|
   |00000000000001     |00000000000001_1_0  |trip_13           |datestr=2020          |3cc87619-d56d-4a6d-9023-8af97824bfac_1-10-33_00000000000001.parquet |1680265327825|trip_13 |1680265327825 |rider_13|driver_13|0.037809287288638194|0.20234037038861052|0.7404294591470656  |0.29316985501104065  |[93.45037833211967, USD] |[[50.56012365982448, USD]] |false             |datestr=2020|
   |00000000000001     |00000000000001_1_1  |trip_14           |datestr=2020          |3cc87619-d56d-4a6d-9023-8af97824bfac_1-10-33_00000000000001.parquet |1680265327825|trip_14 |1680265327825 |rider_14|driver_14|0.7519002026514892  |0.9448162986968871 |0.40054933992868635 |0.0038455626793925113|[15.880759811433354, USD]|[[84.44445639423378, USD]] |false             |datestr=2020|
   |00000000000001     |00000000000001_2_0  |trip_11           |datestr=2020          |50b2f640-e284-49ef-a1f8-f5a819a0e7be_2-10-34_00000000000001.parquet |1680265327825|trip_11 |1680265327825 |rider_11|driver_11|0.23032054239540056 |0.9100367991551281 |0.022237439482133525|0.08921895796973023  |[68.27062120012675, USD] |[[39.13358730683697, USD]] |false             |datestr=2020|
   +-------------------+--------------------+------------------+----------------------+--------------------------------------------------------------------+-------------+--------+--------------+--------+---------+--------------------+-------------------+--------------------+---------------------+-------------------------+---------------------------+------------------+------------+
   ```
   After this fix:
   ```
   +-------------------+--------------------+------------------+----------------------+--------------------------------------------------------------------+-------------+--------+--------------+--------+---------+--------------------+-------------------+--------------------+-------------------+-------------------------+---------------------------+------------------+------------+
   |_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name                                                   |timestamp    |_row_key|partition_path|rider   |driver   |begin_lat           |begin_lon          |end_lat             |end_lon            |fare                     |tip_history                |_hoodie_is_deleted|datestr     |
   +-------------------+--------------------+------------------+----------------------+--------------------------------------------------------------------+-------------+--------+--------------+--------+---------+--------------------+-------------------+--------------------+-------------------+-------------------------+---------------------------+------------------+------------+
   |00000000000001     |00000000000001_4_0  |trip_3            |datestr=2018          |bf676b9f-2e9b-48fd-a6a1-26d8139aecd0_4-10-36_00000000000001.parquet |1680265436528|trip_3  |1680265436528 |rider_3 |driver_3 |0.5082028544317309  |0.6186035619925132 |0.019487324652589844|0.34244473537926823|[92.41566255606341, USD] |[[67.8016115074926, USD]]  |false             |datestr=2018|
   |00000000000001     |00000000000001_4_1  |trip_4            |datestr=2018          |bf676b9f-2e9b-48fd-a6a1-26d8139aecd0_4-10-36_00000000000001.parquet |1680265436528|trip_4  |1680265436528 |rider_4 |driver_4 |0.5182280625084768  |0.9253109379737152 |0.33233798005862314 |0.7110019996809055 |[4.44409622575439, USD]  |[[33.869898194219516, USD]]|false             |datestr=2018|
   |00000000000001     |00000000000001_6_0  |trip_0            |datestr=2018          |0f4f79d1-c011-4bc9-9f89-8a4a2197ef56_6-10-38_00000000000001.parquet |1680265436528|trip_0  |1680265436528 |rider_0 |driver_0 |0.12176296539745046 |0.382558364451396  |0.0870559794514496  |0.27640429152343515|[92.1024811423022, USD]  |[[67.77835365292796, USD]] |false             |datestr=2018|
   |00000000000001     |00000000000001_5_0  |trip_2            |datestr=2018          |dec41ce1-9fe8-4cf2-99ab-ffa25297a2da_5-10-37_00000000000001.parquet |1680265436528|trip_2  |1680265436528 |rider_2 |driver_2 |0.5522660335262106  |0.7589583434997402 |0.6039198595852253  |0.8361083230362024 |[78.0609254553147, USD]  |[[27.858948192411514, USD]]|false             |datestr=2018|
   |00000000000001     |00000000000001_7_0  |trip_1            |datestr=2018          |df6c3238-a4f2-4796-b582-2e427c0e1dcd_7-10-39_00000000000001.parquet |1680265436528|trip_1  |1680265436528 |rider_1 |driver_1 |0.7389516331004687  |0.28811408775028   |0.7200780424137405  |0.484662130326595  |[1.6077573601573025, USD]|[[10.341913607318709, USD]]|false             |datestr=2018|
   |00000000000001     |00000000000001_8_0  |trip_5            |datestr=2019          |37ffa376-a86d-4706-a785-2711fe13aa78_8-10-40_00000000000001.parquet |1680265436528|trip_5  |1680265436528 |rider_5 |driver_5 |0.1630151212353752  |0.27057428081894186|0.3808059886411259  |0.3692283742910598 |[31.179184715024654, USD]|[[93.96021299492908, USD]] |false             |datestr=2019|
   |00000000000001     |00000000000001_9_0  |trip_6            |datestr=2019          |4b1783be-5cae-4ef9-9030-60cbce595531_9-10-41_00000000000001.parquet |1680265436528|trip_6  |1680265436528 |rider_6 |driver_6 |0.5420218856799521  |0.3717532476763643 |0.7316585090626965  |0.5182677308446296 |[49.210873427144186, USD]|[[2.034155984429642, USD]] |false             |datestr=2019|
   |00000000000001     |00000000000001_10_0 |trip_8            |datestr=2019          |8af5538e-bbe9-4291-8814-bc5e356f90dc_10-10-42_00000000000001.parquet|1680265436528|trip_8  |1680265436528 |rider_8 |driver_8 |0.8253202558194069  |0.8769063071666001 |0.9978855323416493  |0.07003530632543731|[22.31002279951365, USD] |[[30.365612077091576, USD]]|false             |datestr=2019|
   |00000000000001     |00000000000001_10_1 |trip_9            |datestr=2019          |8af5538e-bbe9-4291-8814-bc5e356f90dc_10-10-42_00000000000001.parquet|1680265436528|trip_9  |1680265436528 |rider_9 |driver_9 |0.31560399915225323 |0.496779058144757  |0.6974261081429741  |0.9073312408362796 |[87.04727640702991, USD] |[[96.17579621323826, USD]] |false             |datestr=2019|
   |00000000000001     |00000000000001_11_0 |trip_7            |datestr=2019          |eea5a961-af77-4834-9abf-73cc5bf20eff_11-10-43_00000000000001.parquet|1680265436528|trip_7  |1680265436528 |rider_7 |driver_7 |0.08038761693792418 |0.632904243467236  |0.555660576167659   |0.4872642442124486 |[13.555441426862014, USD]|[[10.544626239374132, USD]]|false             |datestr=2019|
   |00000000000001     |00000000000001_3_0  |trip_12           |datestr=2020          |1cc534c7-1e6f-4412-bd6c-c1855070974a_3-10-35_00000000000001.parquet |1680265436528|trip_12 |1680265436528 |rider_12|driver_12|0.004079088549327037|0.16874021976709552|0.20828594874323636 |0.895462473317559  |[92.18052838420539, USD] |[[44.650703399553215, USD]]|false             |datestr=2020|
   |00000000000001     |00000000000001_1_0  |trip_11           |datestr=2020          |ffc5ecc7-46e9-4fe4-a6ec-7adbf1e31e33_1-10-33_00000000000001.parquet |1680265436528|trip_11 |1680265436528 |rider_11|driver_11|0.16583914122830068 |0.28708446826172784|0.6707401823203576  |0.20113024584157368|[13.875727591686381, USD]|[[52.648852351275025, USD]]|false             |datestr=2020|
   |00000000000001     |00000000000001_0_0  |trip_10           |datestr=2020          |09f09c46-8d3a-420d-a7d5-435c6280b161_0-10-32_00000000000001.parquet |1680265436528|trip_10 |1680265436528 |rider_10|driver_10|0.7531144860222685  |0.9217065388363564 |0.12736143989601045 |0.6846542499163221 |[85.46301785894622, USD] |[[67.94440570686055, USD]] |false             |datestr=2020|
   |00000000000001     |00000000000001_2_0  |trip_13           |datestr=2020          |bb0e66b1-a853-4075-a9ce-f5150d1db17e_2-10-34_00000000000001.parquet |1680265436528|trip_13 |1680265436528 |rider_13|driver_13|0.37536133167833274 |0.13380768426991696|0.7165151686625107  |0.4484507140549401 |[37.18742431963579, USD] |[[29.610616003915634, USD]]|false             |datestr=2020|
   |00000000000001     |00000000000001_2_1  |trip_14           |datestr=2020          |bb0e66b1-a853-4075-a9ce-f5150d1db17e_2-10-34_00000000000001.parquet |1680265436528|trip_14 |1680265436528 |rider_14|driver_14|0.5579756297430776  |0.39976488479239436|0.4722872205073937  |0.10655015779417953|[95.73825510010874, USD] |[[94.31603336355222, USD]] |false             |datestr=2020|
   +-------------------+--------------------+------------------+----------------------+--------------------------------------------------------------------+-------------+--------+--------------+--------+---------+--------------------+-------------------+--------------------+-------------------+-------------------------+---------------------------+------------------+------------+
   ```
   
   ### Impact
   
   A bug fix for bootstrap tables.
   
   ### Risk level (write none, low medium or high below)
   
   low
   
   The `HoodieBootstrapFileReader` is used during clustering only when the base 
file has a bootstrap path.
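
The guard described above can be sketched roughly as follows. This is an illustration only, assuming a base file carries an optional bootstrap path; `BaseFileInfo` and `reader_kind` are hypothetical names and do not correspond to Hudi classes or methods.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseFileInfo:
    """Hypothetical stand-in for a base file handed to clustering."""
    path: str
    bootstrap_path: Optional[str] = None  # set only for bootstrapped base files


def reader_kind(base_file: BaseFileInfo) -> str:
    """Return which reader a clustering task would pick for this base file."""
    # Only bootstrapped base files take the new stitching path.
    if base_file.bootstrap_path:
        return "bootstrap"
    return "regular"


if __name__ == "__main__":
    print(reader_kind(BaseFileInfo("skeleton.parquet", "source.parquet")))
    print(reader_kind(BaseFileInfo("plain.parquet")))
```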
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
     ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   

