CurtHagenlocher commented on code in PR #1858:
URL: https://github.com/apache/arrow-adbc/pull/1858#discussion_r1599313578
##########
csharp/src/Drivers/Apache/Spark/SparkConnection.cs:
##########
@@ -481,8 +486,9 @@ private static IArrowType GetArrowType(ColumnTypeId
columnTypeId, string typeNam
case ColumnTypeId.CHAR_TYPE:
return StringType.Default;
case ColumnTypeId.DECIMAL_TYPE:
- // TODO: Parse typeName for precision and scale, because
not available in other metadata.
- return new Decimal128Type(38, 38);
+ // Note: parsing the type definition is only viable at the
table level. Won't
+ // work for statement results.
Review Comment:
What will happen for statement results? And does that code path go through
here?
##########
csharp/src/Drivers/Apache/Spark/SparkConnection.cs:
##########
@@ -688,9 +693,63 @@ private string PatternToRegEx(string? pattern)
return builder.ToString();
}
+
+ /// <summary>
+ /// Provides a parser for SQL DECIMAL type definitions.
+ /// </summary>
+ private static class SqlDecimalTypeParser
+ {
+ // Pattern is based on this definition
+ //
https://docs.databricks.com/en/sql/language-manual/data-types/decimal-value.html#syntax
+ // { DECIMAL | DEC | NUMERIC } [ ( p [ , s ] ) ]
+ private static readonly Regex s_expression = new(
+
@"^\s*(?<typeName>((DECIMAL)|(DEC)|(NUMERIC)))(\s*\(\s*((?<precision>\d+)(\s*\,\s*(?<scale>\d+))?)\s*\))?\s*$",
+ RegexOptions.IgnoreCase | RegexOptions.Compiled |
RegexOptions.CultureInvariant);
+
+ /// <summary>
+ /// Parses the input string for a valid SQL DECIMAL type
definition and returns a new <see cref="Decimal128Type"/> or returns the
<c>defaultValue</c>, if invalid.
+ /// </summary>
+ /// <param name="input">The SQL type defintion string to
parse.</param>
+ /// <param name="defaultValue">If input string is an invalid SQL
DECIMAL type definition, this value is returned instead.</param>
+ /// <returns>If input string is a valid SQL DECIMAL type
definition, it returns a new <see cref="Decimal128Type"/>; otherwise
<c>defaultValue</c>.</returns>
+ public static Decimal128Type ParseOrDefault(string input,
Decimal128Type defaultValue)
+ {
+ return TryParse(input, out Decimal128Type? candidate) ?
candidate! : defaultValue;
+ }
+
+ /// <summary>
+ /// Tries to parse the input string for a valid SQL DECIMAL type
definition.
+ /// </summary>
+ /// <param name="input">The SQL type defintion string to
parse.</param>
+ /// <param name="value">If successful, an new <see
cref="Decimal128Type"/> with the precision and scale set; otherwise
<c>null</c>.</param>
+ /// <returns>True if it can successfully parse the type definition
input string; otherwise false.</returns>
+ private static bool TryParse(string input, out Decimal128Type?
value)
+ {
+ // Ensure defaults are set, in case not provided in
precision/scale clause.
+ int precision = 10;
+ int scale = 0;
+
+ Match match = s_expression.Match(input);
+ if (!match.Success)
+ {
+ value = null;
+ return false;
+ }
+
+ GroupCollection groups = match.Groups;
+ Group precisionGroup = groups["precision"];
+ Group scaleGroup = groups["scale"];
+
+ precision = precisionGroup.Success ?
int.Parse(precisionGroup.Value) : precision;
+ scale = scaleGroup.Success ? int.Parse(scaleGroup.Value) :
scale;
Review Comment:
These should probably be TryParse; the capture might succeed but capture so
many digits that the Parse will throw an exception. Alternatively, the regex
could probably specify a range of lengths -- but TryParse seems like a safer
and more readable approach to me.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]