rdblue edited a comment on issue #24246: [SPARK-24252][SQL] Add TableCatalog API
URL: https://github.com/apache/spark/pull/24246#issuecomment-489773795

@cloud-fan, I don't think sources need to be case insensitive. Spark should assume that sources are case sensitive and apply its own case-sensitivity setting. There are two main cases, table identifiers and column identifiers:

* When identifying tables, Spark can only pass on the identifier provided by the user. It cannot control whether the underlying catalog is case sensitive or not. For example, if a catalog contains both `a.b` and `A.B`, then requiring case-insensitive matching would make the lookup ambiguous. Case-sensitive catalogs would necessarily break Spark's assumption and return the one matching case. So there's no point in enforcing case insensitivity when identifying tables.
* When identifying columns, Spark knows the table schema that the source reports, so it should apply its case-sensitivity rules before calling `alterTable`. If Spark relied on the catalog to enforce case-sensitivity settings (or on the table, for scan projection), then implementations would inevitably get it wrong. Instead, Spark can resolve the name the user provided (e.g., `aBc`) against the table definition (e.g., `abc`) and pass an identifier that matches the table definition, so that case sensitivity in the implementation doesn't matter.

@gatorsmile, any additional comments on case sensitivity?
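To illustrate the column-resolution point above, here is a minimal, hypothetical sketch (not the actual Spark code; the class and method names are invented) of resolving a user-supplied column name against the schema reported by the table, under a case-sensitivity flag, before the resolved name is handed to the catalog implementation:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: resolve a user-provided column name against the
// table's reported schema, applying the case-sensitivity setting on the
// Spark side. The returned name always uses the table's own spelling, so
// the catalog/table implementation never needs its own case handling.
public class ColumnResolver {
  public static Optional<String> resolve(
      List<String> tableColumns, String userName, boolean caseSensitive) {
    for (String col : tableColumns) {
      boolean matches = caseSensitive
          ? col.equals(userName)
          : col.equalsIgnoreCase(userName);
      if (matches) {
        // Pass the table's spelling downstream (e.g., into alterTable).
        return Optional.of(col);
      }
    }
    // No match: the analyzer would raise a "column not found" error here.
    return Optional.empty();
  }
}
```

With case-insensitive resolution, a user reference to `aBc` resolves to the table's `abc`, and only `abc` is ever sent to the implementation.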
