eric9204 commented on issue #6966:
URL: https://github.com/apache/hudi/issues/6966#issuecomment-1311163178

   @fengjian428 When spark-sql is used to write data to Hudi, the deltacommit 
action and the compaction action run one after the other, so they do not 
influence each other. Structured streaming is different: the compaction 
service and the write process share the same `HoodieWriteConfig`.
   
`hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/RunCompactionActionExecutor.java`
 line 87 `HoodieWriteConfig configCopy = config;`
   
   Despite its name, the `configCopy` used by the compaction service is not a 
copy; it just points to the original configuration object. 
   
   If `hoodie.datasource.write.drop.partition.columns=true` is set, the write 
process removes the partition fields from the value of `hoodie.avro.schema` in 
`config`.
   
   The compaction service then resets the value of `hoodie.avro.schema` 
(`hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/RunCompactionActionExecutor.java`
 line 94 `configCopy.setSchema(schemaPair.getRight().get());`). 
   
   Because both names refer to the same object, the value of 
`hoodie.avro.schema` used by the write process is changed too.
   
   After the first successful compaction, the changed Avro schema no longer 
matches the records, whose partition fields were removed by the write process. 
That is why this problem arises.
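
   To make the aliasing concrete, here is a minimal sketch that does not use 
Hudi at all. `SketchConfig` is a hypothetical stand-in for `HoodieWriteConfig`, 
and the comma-separated schema strings are placeholders for real Avro schemas. 
It also assumes the schema set by compaction still includes the partition 
column (`dt` here).

```java
import java.util.Properties;

// Hypothetical stand-in for a write config; this is NOT Hudi's HoodieWriteConfig.
class SketchConfig {
    private final Properties props = new Properties();

    String getSchema() {
        return props.getProperty("hoodie.avro.schema");
    }

    void setSchema(String schema) {
        props.setProperty("hoodie.avro.schema", schema);
    }
}

class AliasingDemo {
    // Returns the schema the writer sees after "compaction" has run.
    static String writerSchemaAfterCompaction() {
        SketchConfig config = new SketchConfig();

        // Write process: partition column "dt" already dropped (placeholder schema).
        config.setSchema("id,name");

        // Same shape as `HoodieWriteConfig configCopy = config;`:
        // this copies the reference, not the object.
        SketchConfig configCopy = config;

        // Compaction resets the schema on what it believes is its own copy
        // (assumed here to still contain the partition column).
        configCopy.setSchema("id,name,dt");

        // The writer reads through the original reference and sees the change.
        return config.getSchema();
    }

    public static void main(String[] args) {
        System.out.println(writerSchemaAfterCompaction());
    }
}
```

   In this sketch the writer ends up with the schema that compaction set, even 
though it never called `setSchema` a second time itself.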
   
   Therefore, another way to solve the problem is to deep copy the config 
instead of just pointing to the original configuration 
(`hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/RunCompactionActionExecutor.java`
 line 87 `HoodieWriteConfig configCopy = config;`).
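
   A rough sketch of what the deep copy idea could look like, again without 
Hudi. `CopyableConfig` and its copy constructor are hypothetical and only 
illustrate the technique; how `HoodieWriteConfig` itself should be copied is 
not shown here. The schema strings are placeholders for real Avro schemas.

```java
import java.util.Properties;

// Hypothetical config class with a copy constructor; NOT a Hudi class.
class CopyableConfig {
    private final Properties props = new Properties();

    CopyableConfig() {
    }

    // Copies every key/value pair so the two instances share no mutable state.
    CopyableConfig(CopyableConfig other) {
        for (String key : other.props.stringPropertyNames()) {
            props.setProperty(key, other.props.getProperty(key));
        }
    }

    String getSchema() {
        return props.getProperty("hoodie.avro.schema");
    }

    void setSchema(String schema) {
        props.setProperty("hoodie.avro.schema", schema);
    }
}

class CopyDemo {
    // Returns the schema the writer sees after "compaction" has run on a real copy.
    static String writerSchemaAfterCompaction() {
        CopyableConfig config = new CopyableConfig();
        config.setSchema("id,name");            // writer: partition column dropped

        // Independent copy instead of a second reference to the same object.
        CopyableConfig configCopy = new CopyableConfig(config);
        configCopy.setSchema("id,name,dt");     // compaction: changes only its copy

        return config.getSchema();              // writer's schema is untouched
    }

    public static void main(String[] args) {
        System.out.println(writerSchemaAfterCompaction());
    }
}
```

   With an independent copy, the schema set by compaction stays local to the 
compaction service and the writer keeps the schema it configured.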
   
   Or adopt the solution in #7167, which may be a little simpler.
   
   Any suggestions?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
