serramatutu opened a new pull request, #3604:
URL: https://github.com/apache/arrow-adbc/pull/3604

   ## Motivation
   The `Type` metadata key has two limitations, both of which stem from BigQuery's API:
   1. it says fields of type `ARRAY<T>` are just `T` with `Repeated=true`
   2. it says `STRUCT<...>` fields are simply `RECORD`, and erases any 
information about the inner fields.
   
   These limitations can cause problems when parsing the `Type` key or 
when using it verbatim against the warehouse in a statement, e.g. a `CREATE 
TABLE` statement or an `AS T` cast.
   
   ## Summary
   This PR adds a new `BIGQUERY:type` key that holds the field's SQL type 
string, formatted as specified by BigQuery.
   
   Most types remain unchanged as they come from `gobigquery`, and in those 
cases this key will contain the same value as `Type`.
   
   However, arrays and structs get transformed to match the richer type string.
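
The transformation described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `field` struct is a hypothetical, simplified stand-in for the field schema returned by the BigQuery client, `sqlType` is a hypothetical helper name, and the `name TYPE` layout of struct members is assumed from BigQuery's SQL syntax for `STRUCT` declarations.

```go
package main

import (
	"fmt"
	"strings"
)

// field is a hypothetical, simplified stand-in for a BigQuery field schema.
// It is not the driver's or the client library's actual type.
type field struct {
	Name     string
	Type     string  // type name as reported by the API, e.g. "INTEGER" or "RECORD"
	Repeated bool    // true when the field is an array
	Fields   []field // nested fields, only meaningful for RECORD
}

// sqlType renders a field as a SQL type string. RECORD fields are expanded
// into STRUCT<...> recursively, and repeated fields are wrapped in ARRAY<...>.
// All other types are passed through unchanged.
func sqlType(f field) string {
	t := f.Type
	if f.Type == "RECORD" {
		parts := make([]string, 0, len(f.Fields))
		for _, inner := range f.Fields {
			parts = append(parts, inner.Name+" "+sqlType(inner))
		}
		t = "STRUCT<" + strings.Join(parts, ", ") + ">"
	}
	if f.Repeated {
		t = "ARRAY<" + t + ">"
	}
	return t
}

func main() {
	ids := field{Name: "ids", Type: "INTEGER", Repeated: true}
	points := field{Name: "points", Type: "RECORD", Repeated: true, Fields: []field{
		{Name: "x", Type: "INTEGER"},
		{Name: "tags", Type: "STRING", Repeated: true},
	}}
	fmt.Println(sqlType(ids))
	fmt.Println(sqlType(points))
}
```

In this sketch a plain `INTEGER` field keeps its type name, which matches the statement that most types carry the same value as `Type`.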
   
   ## Testing
   I ran a `CREATE TABLE AS` query against BigQuery. Here are the results for 
fields of different types:
   
   [1] Regular non-nested types are simply copied over from the value of `Type`
   <details>
   <summary>1</summary>
   <img width="331" height="1071" alt="image" src="https://github.com/user-attachments/assets/ccd2ce17-37d8-4630-bef5-a503ed450c2a" />
   </details>
   
   [2] An array of integers becomes `ARRAY<INTEGER>`, while `Type` remains 
`INTEGER`
   <details>
   <summary>2</summary>
   <img width="319" height="369" alt="image" src="https://github.com/user-attachments/assets/e588d7ac-c7ca-40fb-ab51-9795e566d240" />
   </details>
   
   [3] An array of structs becomes `ARRAY<STRUCT<...>>`
   <details>
   <summary>3</summary>
   <img width="551" height="816" alt="image" src="https://github.com/user-attachments/assets/bb946ebc-747a-4529-88a8-68636f94e44e" />
   </details>
   
   [4] In a struct of arrays, the inner types are `ARRAY<...>`
   <details>
   <summary>4</summary>
   <img width="610" height="922" alt="image" src="https://github.com/user-attachments/assets/932a3554-ea56-4b1f-8642-801ee91c4f63" />
   </details>
   
   [5] A deeply nested struct also has the correct inner types
   <details>
   <summary>5</summary>
   <img width="1327" height="1307" alt="image" src="https://github.com/user-attachments/assets/3185651b-8809-42b0-adc4-ec956eaf9e87" />
   </details>
   
   ## Related issues
   - https://github.com/apache/arrow-adbc/issues/3449

