[ 
https://issues.apache.org/jira/browse/HUDI-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hehuiyuan updated HUDI-7169:
----------------------------
    Description: 
If the partition field value is null, `defaultParName` is used in its place 
when writing data.
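
As a minimal sketch of the write-side substitution described above (not Hudi's actual writer code): the class name, the method name, and the literal used for the default partition name are all hypothetical placeholders chosen for illustration.

```java
// Hypothetical helper illustrating the substitution; not part of Hudi.
final class PartitionPathSketch {
    // Assumed placeholder; the real default name comes from the table configuration.
    static final String DEFAULT_PAR_NAME = "__DEFAULT_PARTITION__";

    // Returns the string written for a partition field value:
    // a null value is replaced by the default partition name.
    static String toPartitionString(Object fieldValue) {
        return fieldValue == null ? DEFAULT_PAR_NAME : fieldValue.toString();
    }

    public static void main(String[] args) {
        System.out.println(toPartitionString("2023-12-01"));
        System.out.println(toPartitionString(null));
    }
}
```

Because the null is no longer visible in the stored string, the reader has to map `defaultParName` back to null, which is what the cases below are about.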

 

When reading:

1. If partVal equals `defaultParName`, the true value of the partition field 
is null, and the code returns null:
```
Object partVal = partStrArray[idx].equals(defaultParName)
            ? null : DataTypeUtils.resolvePartition(partStrArray[idx], partitionTypes.get(idx));
```

The resulting column stats:

ColumnStats columnStats = new ColumnStats(null, null, 1);

 

2. If partVal equals `defaultParName` but partKey does not equal 
`defaultParName`, the code returns `defaultParName`:
```
Object partVal = partKey.equals(defaultParName)
            ? null : DataTypeUtils.resolvePartition(partStrArray[idx], partitionTypes.get(idx));
```

The resulting column stats:

ColumnStats columnStats = new ColumnStats(`defaultParName`, `defaultParName`, 0);

 

3. Example: the IsNotNull evaluator:
```
  public static class IsNotNull extends LeafEvaluator {
    private static final long serialVersionUID = 1L;

    public static IsNotNull getInstance() {
      return new IsNotNull();
    }

    @Override
    public boolean eval(Map<String, ColumnStats> columnStatsMap) {
      ColumnStats columnStats = getColumnStats(columnStatsMap);
      // should consider FLOAT/DOUBLE & NAN
      return columnStats.getMinVal() != null || columnStats.getNullCnt() <= 0;
    }
  }
```

If the partition field value is null:

applying 2 to 3, the `eval` method returns true;

applying 1 to 3, the `eval` method returns false.

 

4. Example: the IsNull evaluator:
```
  public static class IsNull extends LeafEvaluator {
    private static final long serialVersionUID = 1L;

    public static IsNull getInstance() {
      return new IsNull();
    }

    @Override
    public boolean eval(Map<String, ColumnStats> columnStatsMap) {
      ColumnStats columnStats = getColumnStats(columnStatsMap);
      return columnStats.getNullCnt() > 0;
    }
  }
```

If the partition field value is null:

applying 1 to 4, the `eval` method returns true;

applying 2 to 4, the `eval` method returns false.
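
The four outcomes above can be reproduced with a small self-contained sketch. It is only an illustration under stated assumptions: `Stats` is a simplified stand-in for `ColumnStats`, the partition key `dt` and the default-name literal are hypothetical, `DataTypeUtils.resolvePartition` is replaced by passing the string through unchanged, and the mapping from the resolved value to stats mirrors the two `ColumnStats` constructions quoted in items 1 and 2.

```java
// Hypothetical, simplified model of the two comparisons; not Hudi code.
final class NullPartitionSketch {
    // Assumed placeholder for the configured default partition name.
    static final String DEFAULT_PAR_NAME = "__DEFAULT_PARTITION__";

    // Simplified stand-in for ColumnStats: min value, max value, null count.
    static final class Stats {
        final Object minVal;
        final Object maxVal;
        final long nullCnt;

        Stats(Object minVal, Object maxVal, long nullCnt) {
            this.minVal = minVal;
            this.maxVal = maxVal;
            this.nullCnt = nullCnt;
        }
    }

    // Builds stats as in items 1 and 2: a null value gives (null, null, 1),
    // any other value gives (value, value, 0).
    static Stats toStats(Object partVal) {
        return partVal == null ? new Stats(null, null, 1) : new Stats(partVal, partVal, 0);
    }

    // Item 1: compare the partition VALUE with the default name.
    static Stats statsComparingValue(String partKey, String partVal) {
        return toStats(partVal.equals(DEFAULT_PAR_NAME) ? null : partVal);
    }

    // Item 2: compare the partition KEY with the default name.
    static Stats statsComparingKey(String partKey, String partVal) {
        return toStats(partKey.equals(DEFAULT_PAR_NAME) ? null : partVal);
    }

    // Same predicate as the IsNotNull evaluator in item 3.
    static boolean isNotNull(Stats s) {
        return s.minVal != null || s.nullCnt <= 0;
    }

    // Same predicate as the IsNull evaluator in item 4.
    static boolean isNull(Stats s) {
        return s.nullCnt > 0;
    }

    public static void main(String[] args) {
        // A null partition field was written as the default name.
        Stats byValue = statsComparingValue("dt", DEFAULT_PAR_NAME);
        Stats byKey = statsComparingKey("dt", DEFAULT_PAR_NAME);

        System.out.println("IsNotNull: item 1 = " + isNotNull(byValue)
            + ", item 2 = " + isNotNull(byKey));   // item 1 = false, item 2 = true
        System.out.println("IsNull:    item 1 = " + isNull(byValue)
            + ", item 2 = " + isNull(byKey));      // item 1 = true, item 2 = false
    }
}
```

Only the value comparison from item 1 gives evaluator results that match a partition field whose true value is null.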

 

  was:
1. If partVal  equals `defaultParName`,   the value of part field is null and 
the code return null.
```
Object partVal = partStrArray[idx].equals(defaultParName)
            ? null : DataTypeUtils.resolvePartition(partStrArray[idx], 
partitionTypes.get(idx));
```
ColumnStats columnStats = new ColumnStats(null, null, 1);


2. if parVal  equals`defaultParName` and parKey is not equal to 
`defaultParName`, it will return `defaultParName`
```
Object partVal = partKey.equals(defaultParName)
            ? null : DataTypeUtils.resolvePartition(partStrArray[idx], 
partitionTypes.get(idx));
```
ColumnStats columnStats = new ColumnStats(`defaultParName`,`defaultParName`,  
0);


3. Example for LeafEvaluator evaluator:
```
  public static class IsNotNull extends LeafEvaluator {
    private static final long serialVersionUID = 1L;

    public static IsNotNull getInstance() {
      return new IsNotNull();
    }

    @Override
    public boolean eval(Map<String, ColumnStats> columnStatsMap) {
      ColumnStats columnStats = getColumnStats(columnStatsMap);
      // should consider FLOAT/DOUBLE & NAN
      return columnStats.getMinVal() != null || columnStats.getNullCnt() <= 0;
    }
  }
```

if part filed value is null ,

apply 2 to 3 , the `eval` method return true

apply 1 to 3 , the `eval` method return false. 

 

 


> Comparison between defaultParName and partValue
> -----------------------------------------------
>
>                 Key: HUDI-7169
>                 URL: https://issues.apache.org/jira/browse/HUDI-7169
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: hehuiyuan
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
