luofeng1994 opened a new issue, #5098: URL: https://github.com/apache/gravitino/issues/5098
### Version

0.6.0

### Describe what's wrong

When Gravitino parses metadata from a Doris database, fields with the data type `array<varchar()>` are incorrectly parsed as `varchar()` instead of retaining the `array<>` structure. This misclassification makes the schema representation inaccurate and can cause problems in downstream processing and query generation.

Here is the data type in the Doris DDL, which is `array<varchar(500)>`.

Here is the data type as parsed by Gravitino, which is incorrectly reported as `varchar(500)`.

### Error message and/or stacktrace

No error message.

### How to reproduce

Version: 0.6.0-incubating

To reproduce, create a table with an array field in Doris and view it in Gravitino.

### Additional context

Based on my investigation, the cause of this error lies in the deserialization of data types. During deserialization (`org.apache.gravitino.json.fromPrimitiveTypeString`), strings that do not match a `PrimitiveType` are evaluated against regular expressions to determine their type. However, the regular expression for VARCHAR also matches the `array<varchar()>` data type, so it is misclassified as VARCHAR. As a result, `array<varchar(255)>` is parsed as `varchar(255)`.

I believe the solution is to introduce a new `Type.ArrayType` data type and a corresponding regular expression that matches `array<>` data types. That way, `array<>` structures can be identified correctly during deserialization instead of being misclassified as basic types like VARCHAR, and `array<varchar(255)>` will be parsed as an array.
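The suspected failure mode and the proposed fix can be illustrated with a minimal, standalone sketch. Everything in it is an assumption made for illustration: the class `TypeParseSketch`, both patterns, and both methods are hypothetical and are not copied from the Gravitino code base, and the real VARCHAR pattern and matching call may differ. The sketch only shows how an unanchored search would find `varchar(500)` inside `array<varchar(500)>`, and how checking for the array wrapper first avoids that.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual Gravitino implementation.
class TypeParseSketch {
    // Assumed patterns, written for this example.
    private static final Pattern VARCHAR = Pattern.compile("varchar\\(\\s*(\\d+)\\s*\\)");
    private static final Pattern ARRAY = Pattern.compile("array<(.+)>");

    // Naive variant: find() searches anywhere in the string, so a varchar
    // nested inside array<...> is reported as the top-level type.
    static String parseNaive(String type) {
        Matcher m = VARCHAR.matcher(type);
        if (m.find()) {
            return "varchar(" + m.group(1) + ")";
        }
        return "unparsed(" + type + ")";
    }

    // Array-aware variant: test for the array wrapper first and recurse on
    // the element type; varchar must match the whole string.
    static String parseArrayAware(String type) {
        String t = type.trim();
        Matcher a = ARRAY.matcher(t);
        if (a.matches()) {
            return "array<" + parseArrayAware(a.group(1)) + ">";
        }
        Matcher v = VARCHAR.matcher(t);
        if (v.matches()) {
            return "varchar(" + v.group(1) + ")";
        }
        return "unparsed(" + t + ")";
    }

    public static void main(String[] args) {
        String ddlType = "array<varchar(500)>";
        System.out.println(parseNaive(ddlType));      // array wrapper is lost
        System.out.println(parseArrayAware(ddlType)); // array wrapper is kept
    }
}
```

In this sketch the array check runs before the varchar check and both use whole-string matching, so nested types such as `array<array<varchar(10)>>` are also handled by the recursion. Whether the actual fix should live in the JSON deserializer or elsewhere is left open here.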
