I put up ticket #487 for this. theres a whole batch of sql-related
tickets im going to try to line up for a marathon session at some
future date in the next few weeks.
On Feb 21, 2007, at 10:25 AM, Troy wrote:
>
> Mike,
>
> Based on your excellent feedback and trying a couple different
> approaches I came up with the following. I have a diff file, but I'm
> new to contributing to open source so I'm not sure the best method to
> get this to you (so I'll copy paste below). The overview:
>
> * added visit_case_insensitive and overrode in PGCompiler to use
> "lower"
> * sub-classed ClauseList with _OrderByClause and _GroupByClause.
>
> This was done to add some context as to where in the statement the
> clauses belong, because we do not want to mess with the case of the
> results being returned, only when the character data is compared at
> the server. I need to do some more work to peg HAVING, but this works
> well otherwise.
>
> * added _CaseInsensitiveCompare like you mentioned.
> * modified _compare_self instead of __eq__, and added case_compare
> param to the String type
>
> I think this is better then added case_compare at the column level,
> because this only affects character data. Dates, Int's, etc do not
> have the same collation issues and the issues with Date that do exist
> are already addressed.
>
> Let me know if you need me to run in a different direction with this,
> or what I need to do differently to get this considered to be included
> in sqlalchemy.
>
> Troy
>
>
> ====== Diff File ======
>
> Index: sqlalchemy/sql.py
> ===================================================================
> --- sqlalchemy/sql.py (revision 2346)
> +++ sqlalchemy/sql.py (working copy)
> @@ -351,6 +351,7 @@
> def visit_clauselist(self, list):pass
> def visit_calculatedclause(self, calcclause):pass
> def visit_function(self, func):pass
> + def visit_case_insensitive(self, func):pass
> def visit_cast(self, cast):pass
> def visit_label(self, label):pass
> def visit_typeclause(self, typeclause):pass
> @@ -947,6 +948,17 @@
> else:
> return False
>
> +class _OrderByClause(ClauseList):
> + def append(self, clause):
> + if _is_literal(clause):
> + clause = _TextClause(str(clause))
> + elif isinstance(clause,ColumnElement) and not
> getattr(clause.type,'compare_case',True):
> + clause = _CaseInsensitiveCompare(clause)
> + self.clauses.append(clause)
> +
> +class _GroupByClause(_OrderByClause):
> + pass
> +
> class _CompoundClause(ClauseList):
> """represents a list of clauses joined by an operator, such as
> AND or OR.
> extends ClauseList to add the operator as well as a from_objects
> accessor to
> @@ -1037,6 +1049,16 @@
> c.accept_visitor(visitor)
> visitor.visit_function(self)
>
> +class _CaseInsensitiveCompare(ColumnElement):
> + def __init__(self, clause):
> + self.clause = clause
> + self.type = clause.type
> + def accept_visitor(self, visitor):
> + self.clause.accept_visitor(visitor)
> + visitor.visit_case_insensitive(self)
> + def _get_from_objects(self):
> + return self.clause._get_from_objects()
> +
> class _Cast(ColumnElement):
> def __init__(self, clause, totype, **kwargs):
> if not hasattr(clause, 'label'):
> @@ -1364,18 +1386,18 @@
> return True
> def order_by(self, *clauses):
> if len(clauses) == 1 and clauses[0] is None:
> - self.order_by_clause = ClauseList()
> + self.order_by_clause = _OrderByClause()
> elif getattr(self, 'order_by_clause', None):
> - self.order_by_clause =
> ClauseList(*(list(self.order_by_clause.clauses) + list(clauses)))
> + self.order_by_clause =
> _OrderByClause(*(list(self.order_by_clause.clauses) + list(clauses)))
> else:
> - self.order_by_clause = ClauseList(*clauses)
> + self.order_by_clause = _OrderByClause(*clauses)
> def group_by(self, *clauses):
> if len(clauses) == 1 and clauses[0] is None:
> - self.group_by_clause = ClauseList()
> + self.group_by_clause = _GroupByClause()
> elif getattr(self, 'group_by_clause', None):
> - self.group_by_clause = ClauseList(*(list(clauses)
> +list(self.group_by_clause.clauses)))
> + self.group_by_clause = _GroupByClause(*(list(clauses)
> +list(self.group_by_clause.clauses)))
> else:
> - self.group_by_clause = ClauseList(*clauses)
> + self.group_by_clause = _GroupByClause(*clauses)
> def select(self, whereclauses = None, **params):
> return select([self], whereclauses, **params)
> def _get_from_objects(self):
> Index: sqlalchemy/databases/postgres.py
> ===================================================================
> --- sqlalchemy/databases/postgres.py (revision 2346)
> +++ sqlalchemy/databases/postgres.py (working copy)
> @@ -463,7 +463,7 @@
>
> table.append_constraint(ForeignKeyConstraint(constrained_columns,
> refspec, conname))
>
> class PGCompiler(ansisql.ANSICompiler):
> -
> +
> def visit_insert_column(self, column, parameters):
> # all column primary key inserts must be explicitly present
> if column.primary_key:
> @@ -502,7 +502,11 @@
> if isinstance(binary.type, sqltypes.String) and
> binary.operator == '+':
> return '||'
> else:
> - return ansisql.ANSICompiler.binary_operator_string(self,
> binary)
> + return ansisql.ANSICompiler.binary_operator_string(self,
> binary)
> +
> + def visit_case_insensitive(self, object):
> + self.strings[object] = "lower(%s)" %
> self.strings[object.clause]
> +
>
> class PGSchemaGenerator(ansisql.ANSISchemaGenerator):
>
> Index: sqlalchemy/schema.py
> ===================================================================
> --- sqlalchemy/schema.py (revision 2346)
> +++ sqlalchemy/schema.py (working copy)
> @@ -497,6 +497,12 @@
> for constraint in self.constraints:
> constraint.accept_schema_visitor(visitor,
> traverse=True)
> visitor.visit_column(self)
> +
> + def _compare_self(self):
> + if isinstance(self.type,types.String) and not
> self.type.compare_case:
> + return sql._CaseInsensitiveCompare(self)
> + else:
> + return self
>
>
> class ForeignKey(SchemaItem):
> Index: sqlalchemy/types.py
> ===================================================================
> --- sqlalchemy/types.py (revision 2346)
> +++ sqlalchemy/types.py (working copy)
> @@ -169,11 +169,18 @@
> return super(String, cls).__new__(cls, *args, **kwargs)
> else:
> return super(String, TEXT).__new__(TEXT, *args, **kwargs)
> - def __init__(self, length = None):
> + def __init__(self, length = None, compare_case = False):
> + """Initializes new string instance
> +
> + compare_case - determines if case affects comparisons,
> sorting, etc
> + """
> self.length = length
> + self.compare_case = compare_case
> def adapt(self, impltype):
> return impltype(length=self.length)
> def convert_bind_param(self, value, dialect):
> + if not self.compare_case and hasattr(value,'lower'):
> + value = value.lower()
> if not dialect.convert_unicode or value is None or not
> isinstance(value, unicode):
> return value
> else:
> @@ -200,7 +207,7 @@
> return value.decode(dialect.encoding)
> else:
> return value
> -
> +
> class Integer(TypeEngine):
> """integer datatype"""
> def get_dbapi_type(self, dbapi):
> Index: sqlalchemy/ansisql.py
> ===================================================================
> --- sqlalchemy/ansisql.py (revision 2346)
> +++ sqlalchemy/ansisql.py (working copy)
> @@ -225,6 +225,10 @@
> else:
> self.strings[column] =
> self.preparer.format_column_with_table(column)
>
> + def visit_case_insensitive(self, object):
> + #override this in databases that default to case insesitive
> collation
> + self.strings[object] =
> self.strings[object.clause]
> +
> def visit_fromclause(self, fromclause):
> self.froms[fromclause] = fromclause.name
>
> @@ -257,7 +261,7 @@
> self.strings[compound] = "(" + s + ")"
> else:
> self.strings[compound] = s
> -
> +
> def visit_clauselist(self, list):
> if list.parens:
> self.strings[list] = "(" + string.join([s for s in
> [self.get_str(c) for c in list.clauses] if s is not None], ', ') + ")"
>
>
>
>
>
>
>
>
>
> On Feb 19, 12:17 pm, "Michael Bayer" <[EMAIL PROTECTED]> wrote:
>> On Feb 19, 2:32 pm, "Troy" <[EMAIL PROTECTED]> wrote:
>>
>>> Even though MySQL allows the setting of COLLATE, it does not support
>>> functional indexes, so if your code explicitly calls lower you
>>> technically now have code that will work for both MySQL and
>>> Postgres,
>>> but MySQL is going to take a big performance hit and perform table
>>> scans regardless of any created indexes. Realistically, I can't see
>>> anyone setting MySQL collate rules to case-sensitive without the
>>> support of functional indexes, which MySQL does not have.
>>
>> right...so some explicitness is required, somewhere (since SA cant
>> just put lower() across the board).
>>
>>
>>
>>> In simple terms, if the Postgres dialect supported
>>> compare_insensitive=True|False|[upper|lower]?
>>
>> on a Column-by-Column or even per-operation context is probably more
>> useful.
>>
>>> sqlalchemy easily support Postgres and MySQL with the same code
>>> in out-
>>> of-the-box configurations. But even if it didn't, if there was a
>>> way
>>> to override the default postgres dialect I'd be a happy camper.
>>> Infact, that is exactly what I have done. I added a dialect called
>>> "lowergres", but I'm stuck because I can not seem to find the
>>> appropriate hook to alter the column to a func.lower when the column
>>> is not part of the selected column list.
>>
>> format_column looks to me
>>
>>> like the right place to do it without converting a column to a
>>> function, but the column object in that function has no context
>>> as to
>>> where it lies in the sql statement.
>>
>> for this approach, id advise setting "state" on the compiler when you
>> are processing the column clause of a SELECT (or dealing with insert/
>> update column lists too), vs. when you are processing WHERE, ORDER
>> BY,
>> etc. that flag can be used to determine where youre traversing. the
>> compiler knows the context since its the one generating the
>> statement.
>>
>> a more accurate way to do is to wrap the Column itself, as it goes
>> into a comparison operation or ORDER BY, in a new construct such as
>> "CaseInsensitive"...heres some pseudo-ish code based on the __eq__()
>> method you see in sqlalchemy/sql.py _CompareMixin (assume "self" is a
>> Column):
>>
>> def __eq__(self, other):
>> return _BooleanExpression(CaseInsensitive(self), other, '==')
>>
>> Better yet its configurable on Column:
>>
>> def __eq__(self, other):
>> if self.case_insensitive_compare:
>> return _BooleanExpression(CaseInsensitive(self), other, '==')
>> else:
>> return _BooleanExpression(self, other, '==')
>>
>> CaseInsensitive looks a lot like _Function and is just:
>>
>> class CaseInsensitive(_CalculatedClause):
>> def __init__(self, target):
>> self.target = target
>> def accept_visitor(self, visitor):
>> self.target.accept_visitor(visitor)
>> visitor.visit_case_insensitive(self)
>>
>> ansicompiler provides the string representation of the underlying
>> target with no modification:
>>
>> def visit_case_insensitive(self, object):
>> self.strings[object] = self.strings[object.target]
>>
>> postgres compiler with case_insensitive provides it as:
>>
>> def visit_case_insensitive(self, object):
>> self.strings[object] = "lower(%s)" % self.strings[object.target]
>>
>> other dialects can have whatever flags to turn on/off the "lower()"
>> wrapping as well.
>>
>> what we're really talking about here is a func.lower() that has
>> special meaning to the compiler, i.e. that it should be conditionally
>> applied based on dialect flags. i think the flag on Column to have
>> the wrapper applied might be pretty slick.
>>
>>
>>
>>> I'm curious as to others experiences with writing an app that
>>> supports
>>> both Postgres and MySQL with sqlalchemy, because if someone else is
>>> doing this then I must be missing something, or maybe not. At
>>> first,
>>> our unit tests all passed, then when we added real world data with
>>> mixed case, tests started to fail on everything doing sorts and
>>> where's on character data.
>>
>> yeah i dont think anyone has gotten too far with this issue, also ive
>> no idea what the giant universe of Hibernate users do either (i think
>> there just arent db-platform-neutral J2EE apps).
>>
>>
>>
>>> How about a Pepsi (at PyCon)?
>>
>> sure thing !
>
>
> >
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---