[ 
https://issues.apache.org/jira/browse/SPARK-41226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated SPARK-41226:
--------------------------------
    Description: 
I am creating this one for Desmond Cheong since he can't sign up for an account 
because of 
[https://infra.apache.org/blog/jira-public-signup-disabled.html|https://infra.apache.org/blog/jira-public-signup-disabled.html].

His description for this improvement:
The Spark type system currently supports multiple data types with the same 
physical representation in memory. For example, {{DateType}} and 
{{YearMonthIntervalType}} are both implemented using {{IntegerType}}. 
Because of this, operations on data types often involve case matching where 
multiple data types match to the same effect.

To simplify this case matching logic, we can introduce the notion of logical 
and physical data types, where multiple logical data types can be implemented 
with the same physical data type, and then perform case matching on physical 
data types.

Some areas that can utilize this logical/physical type separation are:
 * {{SpecializedGettersReader}} in {{SpecializedGettersReader.java}}
 * {{copy}} in {{ColumnarBatchRow.java}} and {{ColumnarRow.java}}
 * {{getAccessor}} in {{InternalRow.scala}}
 * {{externalDataTypeFor}} in {{RowEncoder.scala}}
 * {{unsafeWriter}} in {{InterpretedUnsafeProjection.scala}}
 * {{getValue}} and {{javaType}} in {{CodeGenerator.scala}}
 * {{doValidate}} in {{literals.scala}}
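
To make the logical/physical split concrete, here is a minimal Java sketch of the idea. It is only an illustration under stated assumptions: the names {{PhysicalTypeSketch}}, {{PhysicalType}}, {{LogicalType}}, {{Row}}, {{ArrayRow}} and {{read}} are hypothetical and are not Spark's actual classes or the proposed API. It shows several logical types mapping to one physical type, so that an accessor needs one case per physical type rather than one per logical type.

```java
// Hypothetical sketch only: none of these names are Spark's real API.
final class PhysicalTypeSketch {

    // Physical types describe how a value is laid out in memory.
    enum PhysicalType { INT, LONG }

    // Logical types carry the user-facing meaning; several of them can
    // share one physical type (dates and year-month intervals are ints).
    enum LogicalType {
        INTEGER(PhysicalType.INT),
        DATE(PhysicalType.INT),
        YEAR_MONTH_INTERVAL(PhysicalType.INT),
        BIGINT(PhysicalType.LONG),
        TIMESTAMP(PhysicalType.LONG);

        private final PhysicalType physicalType;

        LogicalType(PhysicalType physicalType) {
            this.physicalType = physicalType;
        }

        PhysicalType physicalType() {
            return physicalType;
        }
    }

    // Simplified stand-in for a row with specialized getters.
    interface Row {
        int getInt(int ordinal);
        long getLong(int ordinal);
    }

    // Array-backed row used only to exercise the sketch.
    static final class ArrayRow implements Row {
        private final Object[] values;

        ArrayRow(Object... values) {
            this.values = values;
        }

        @Override
        public int getInt(int ordinal) {
            return (Integer) values[ordinal];
        }

        @Override
        public long getLong(int ordinal) {
            return (Long) values[ordinal];
        }
    }

    // One case per physical type, instead of listing every logical type
    // that happens to share the same representation.
    static Object read(Row row, int ordinal, LogicalType type) {
        switch (type.physicalType()) {
            case INT:
                return row.getInt(ordinal);
            case LONG:
                return row.getLong(ordinal);
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }
}
```

With this shape, adding another logical type backed by an int would only require a new {{LogicalType}} entry; the {{read}} method would stay unchanged.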


> Refactor Spark types by introducing physical types
> --------------------------------------------------
>
>                 Key: SPARK-41226
>                 URL: https://issues.apache.org/jira/browse/SPARK-41226
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Priority: Major


