Github user jainaks commented on the issue:
https://github.com/apache/spark/pull/21320
Hi @mallman,
I found another major issue after applying this fix.
**Schema:**
a: struct (nullable = true)
 |-- b: struct (nullable = true)
 |    |-- c1: string (nullable = true)
 |    |-- c2: string (nullable = true)
 |    |-- c3: string (nullable = true)
 |    |-- c4: string (nullable = true)
 |    |-- c5: boolean (nullable = true)
id: struct (nullable = true)
 |-- i1: struct (nullable = true)
 |    |-- i2: string (nullable = true)
timestamp: bigint
**Query:**
select a.b.c3 as c3,
first(a.b.c3) over (partition by id.i1.i2 order by timestamp
rows between current row and unbounded following) as first_c3
from temp;
The column "first_c3" gets the value of column "c2" instead of "c3".
It works correctly if I just set the parquetSchemaPruning flag to false.
This may sound odd at first glance, and it does to me too, but this is what I
am getting.
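For reference, here is a minimal plain-Python sketch (not Spark code) of what the query above should return. It assumes the default behaviour of `first`, which does not skip nulls. Because the frame starts at the current row, `first_c3` should simply equal the current row's `c3`. The sample rows and the helper name `first_over_following` are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows, flattened for brevity: c2 and c3 stand in for
# a.b.c2 and a.b.c3, and i2 stands in for id.i1.i2. The values are invented.
rows = [
    {"c2": "x1", "c3": "y1", "i2": "k1", "timestamp": 1},
    {"c2": "x2", "c3": "y2", "i2": "k1", "timestamp": 2},
    {"c2": "x3", "c3": "y3", "i2": "k2", "timestamp": 1},
]

def first_over_following(rows, value_key):
    """Emulate first(value) over (partition by i2 order by timestamp
    rows between current row and unbounded following)."""
    out = []
    ordered = sorted(rows, key=itemgetter("i2", "timestamp"))
    for _, group in groupby(ordered, key=itemgetter("i2")):
        part = list(group)
        for idx, row in enumerate(part):
            frame = part[idx:]  # current row through the end of the partition
            out.append({"c3": row["c3"], "first_c3": frame[0][value_key]})
    return out

# Expected behaviour: the window reads c3, so first_c3 matches c3 on every row.
result = first_over_following(rows, "c3")
```

Passing `"c2"` instead of `"c3"` as `value_key` reproduces the kind of wrong output described above, where `first_c3` carries the `c2` value.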
PS: I am running all my tests using PR #16578.