Drill is case-preserving. In most cases, table names (and paths) are case-sensitive (except probably on Windows). Whether column names are case-sensitive depends on the data source. For example, with DFS formats such as Parquet and JSON, column names are case-insensitive, but with MapR-DB (and probably HBase) they are case-sensitive. However, we do not do any kind of validation.
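To make the difference concrete, here is a toy sketch in plain Python. It is not Drill code and says nothing about how Drill implements column resolution; the record list and the two helper functions are hypothetical names made up for illustration. It only models the two lookup behaviors described above: one that ignores the case of the column name and one that requires an exact match.

```python
# Toy model only: plain dicts stand in for records. This is NOT Drill code,
# and the helper names below are hypothetical.

records = [
    {"employee": "alice"},
    {"Employee": "bob"},
    {"EMPLOYEE": "carol"},
]


def select_case_insensitive(rows, column):
    """Return values from any column whose name matches `column`, ignoring case."""
    wanted = column.lower()
    return [
        value
        for row in rows
        for name, value in row.items()
        if name.lower() == wanted
    ]


def select_case_sensitive(rows, column):
    """Return values only from columns whose name matches `column` exactly."""
    return [row[column] for row in rows if column in row]


# Same query text, upper-cased by accident, against both behaviors.
dfs_like = select_case_insensitive(records, "EMPLOYEE")
mapr_db_like = select_case_sensitive(records, "EMPLOYEE")

print(dfs_like)      # ['alice', 'bob', 'carol']
print(mapr_db_like)  # ['carol']
```

The exact-match variant silently drops two of the three rows and raises no error, which is the kind of unnoticed wrong result at issue here.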
So in the case of DFS, the column names "employee" and "Employee" across records are considered the same, and filtering on any case variant of the column name "employee" returns all matching results. But in the case of MapR-DB, we end up with only the rows that match the exact case of the column. This is a bit concerning, as there is potential for wrong results if the user is not careful with case. It is even more concerning when it goes unnoticed, since there is no validation of any kind (I ran into this when I accidentally formatted my queries to upper case).

On Mon, Jan 29, 2018 at 5:36 PM, Timothy Farkas <[email protected]> wrote:

> What is Drill's policy on case sensitivity? Some of the tests assume that
> Drill is case-insensitive, but how does Drill handle data sources like
> HBase that are case sensitive? Do we do some validation that there are no
> potentially conflicting column names like "employee" and "Employee"? Do we
> run and hope for the best? Or do we do something more advanced to handle
> these cases?
>
> Thanks,
> Tim
