Alexey Kudinkin created HUDI-3812:
-------------------------------------

             Summary: Metadata is not enabled by default on the Read Path
                 Key: HUDI-3812
                 URL: https://issues.apache.org/jira/browse/HUDI-3812
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Alexey Kudinkin
            Assignee: Alexey Kudinkin


While Metadata Table is enabled by default on the Write Path (in 
HoodieMetadataConfig), it's disabled by default on the Read Path (at least in 
Spark).

 

Now that Data Skipping is enabled by default (as of 0.10, actually), it fails 
on the Read Path because Data Skipping now relies solely on the Metadata Table 
(MT) and Column Stats to function.

 

We need to revisit the current default configs to make sure they make sense, 
so that we either:
 # Switch off Data Skipping by default as well (if we want to go 
ultra-conservative), or
 # Switch on the Metadata Table on the Read Path by default.
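
To make the two options concrete, here is a rough sketch of how they might look 
as explicit Spark reader options. It is not taken from the Hudi codebase. The 
config keys `hoodie.metadata.enable` and `hoodie.enable.data.skipping` are 
assumed from Hudi's public configuration docs and should be checked against the 
version in use. The helpers `read_options` and `is_consistent` are hypothetical 
names made up for illustration.

```python
# Config keys assumed from Hudi's public configuration docs; verify per version.
METADATA_ENABLE = "hoodie.metadata.enable"
DATA_SKIPPING_ENABLE = "hoodie.enable.data.skipping"


def read_options(metadata_enabled: bool, data_skipping_enabled: bool) -> dict:
    """Build explicit reader options instead of relying on defaults (hypothetical helper)."""
    return {
        METADATA_ENABLE: str(metadata_enabled).lower(),
        DATA_SKIPPING_ENABLE: str(data_skipping_enabled).lower(),
    }


def is_consistent(options: dict) -> bool:
    """Data Skipping relies on the Metadata Table, so it must not be on while MT is off."""
    skipping = options.get(DATA_SKIPPING_ENABLE, "false") == "true"
    metadata = options.get(METADATA_ENABLE, "false") == "true"
    return metadata or not skipping


# Option 1: switch Data Skipping off as well (ultra-conservative).
conservative = read_options(metadata_enabled=False, data_skipping_enabled=False)

# Option 2: switch the Metadata Table on for reads.
with_metadata = read_options(metadata_enabled=True, data_skipping_enabled=True)

# The combination described in this issue: MT off on reads, Data Skipping on.
reported = read_options(metadata_enabled=False, data_skipping_enabled=True)

# Hypothetical usage, assuming an existing SparkSession `spark` and a table path `base_path`:
# df = spark.read.format("hudi").options(**with_metadata).load(base_path)
```

Until the defaults are aligned, passing both options explicitly on the read 
side, as in the commented usage line, would be one way to avoid depending on 
whichever default a given release ships with.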

 

Frankly, I can hardly imagine why we'd enable MT on the Write Path by default 
but not on the Read Path: this brings its cost into everyone's flows but none 
of the benefits. Out of the box, people have to discover that it's switched 
off and switch it on themselves, which seems like something everyone is likely 
to do regardless.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
