[
https://issues.apache.org/jira/browse/HUDI-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lietong Liu resolved HUDI-1667.
-------------------------------
Resolution: Fixed
> Fix bug: when HoodieMergeOnReadRDD reads a record from the base file, Hoodie may set
> a non-null value in a field that is null if vectorization is enabled.
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HUDI-1667
> URL: https://issues.apache.org/jira/browse/HUDI-1667
> Project: Apache Hudi
> Issue Type: Bug
> Components: Common Core
> Reporter: Lietong Liu
> Assignee: Lietong Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.6.0
>
>
> When HoodieMergeOnReadRDD reads a record from the base file, it creates a new
> InternalRow based on requiredStructSchema.
> {code:java}
> // Code placeholder
> private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
>   val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
>   val posIterator = requiredFieldPosition.iterator
>   var curIndex = 0
>   tableState.requiredStructSchema.foreach(
>     f => {
>       val curPos = posIterator.next()
>       // The value is read without first checking whether the field is null.
>       val curField = row.get(curPos, f.dataType)
>       rowToReturn.update(curIndex, curField)
>       curIndex = curIndex + 1
>     }
>   )
>   rowToReturn
> }
> {code}
> Hoodie doesn't check whether a field is null before getting its value here.
> If vectorization is enabled, the row is a *ColumnarBatchRow*, and
> *ColumnarBatchRow* may return a non-null value even if the value of the field is
> null. So Hoodie may set a non-null value in a field that is actually null.
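
A minimal sketch of one way to guard against this, not necessarily the patch that was merged: check isNullAt on the source row before reading each field, and propagate the null explicitly. The object name NullSafeProjection and the standalone project helper, with its requiredSchema and positions parameters, are hypothetical stand-ins for tableState.requiredStructSchema and requiredFieldPosition; only the Spark calls (isNullAt, setNullAt, get, update) are existing API.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.SpecificInternalRow
import org.apache.spark.sql.types.StructType

// Hypothetical helper for illustration; not part of Hudi.
object NullSafeProjection {

  // Copies the fields found at `positions` in `row` into a new row shaped
  // like `requiredSchema`. Assumes `positions` has one entry per schema field.
  def project(row: InternalRow,
              requiredSchema: StructType,
              positions: Seq[Int]): InternalRow = {
    val rowToReturn = new SpecificInternalRow(requiredSchema)
    requiredSchema.fields.zip(positions).zipWithIndex.foreach {
      case ((field, pos), idx) =>
        if (row.isNullAt(pos)) {
          // Do not call row.get here: a columnar row may hand back a
          // leftover non-null value for a slot that is logically null.
          rowToReturn.setNullAt(idx)
        } else {
          rowToReturn.update(idx, row.get(pos, field.dataType))
        }
    }
    rowToReturn
  }
}
```

The same isNullAt check could instead be placed inside the existing foreach in createRowWithRequiredSchema; the standalone form is used here only so that the snippet compiles against Spark on its own.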
--
This message was sent by Atlassian Jira
(v8.3.4#803005)