Hi Samuel,

That’s certainly a possibility we’ve talked about, and for sure it would 
improve the performance vs. my solution. The tradeoffs to consider are that we 
have many columns across many tables where this is used, with 100’s of millions 
of records, and so each new index is a huge chunk of storage space. I think in 
the end, though, we should probably do as you suggest, at least for a selected 
subset of those columns.

As a general case of just optimizing the qualifier though, having the 4 index 
lookups, with the merge, is in practice much faster than one full table scan, 
so it seems like a useful tool to have.

Regards,
Mark

On Feb 25, 2016, at 10:16 AM, Samuel Pelletier 
<[email protected]<mailto:[email protected]>> wrote:

Hi Mark,

I suggest you look into adding a collation to your column or index. I do not 
know how Oracle does this but it surely exists.

With an index using a collation, your query is trivial and much more efficient 
because the index is already case insensitive, there is no OR.

A query like the one you build generate multiple index lookup, one for each 
part of the OR and the server need to merge all those results before iterating 
into the rows, compute the UPPER for each rows and compare. Imagine the speedup 
if a single index lookup can be used, I would expect at least 10X faster on 
large dataset... probably 100X if multiple OR parts returns 1000s rows.

Regards,

Samuel

Le 25 févr. 2016 à 10:16, Morris, Mark 
<[email protected]<mailto:[email protected]>> a écrit :

Hi Lon,

We run into this a lot, and I have a rather complex solution for a “case 
insensitive begins with” qualifier that actually works well, at least with 
Oracle and its indexing. My method returning this qualifier is actually about 
60 lines, but most of that is optimizations for special cases. (We have 
terabytes of data, so every little bit helps!)

The basic idea, though, is that it only takes a couple of characters to hit on 
an index and make a huge difference in performance. So…

Trying to case-insensitive-equals match 
"[email protected]<mailto:[email protected]>” with the search term 
“[email protected]<mailto:[email protected]>”, you would build this sort 
of qualifier:

(emailAddress like ‘PA*’ OR emailAddress like ‘Pa*’ OR emailAddress like ‘pA*’ 
OR emailAddress like ‘pa*’) AND UPPER(emailAddress) = 
UPPER('[email protected]<mailto:[email protected]>’)

Hope this helps!

Regards,
Mark

On Feb 24, 2016, at 7:12 PM, Chuck Hill 
<[email protected]<mailto:[email protected]>> wrote:

If you always want the comparison to be case insensitive, then you should be 
able to specify a database collation  for the column that is insensitive.   
Then just just a regular equals qualifier.  If you want it case sensitive 
sometimes and insensitive others, that is a different problem.

Chuck

From: 
<[email protected]<mailto:[email protected]>>
 on behalf of Lon Varscsak 
<[email protected]<mailto:[email protected]>>
Date: Wednesday, February 24, 2016 at 5:08 PM
To: Paul Hoadley <[email protected]<mailto:[email protected]>>
Cc: WebObjects Development 
<[email protected]<mailto:[email protected]>>
Subject: Re: Use case for a "case insensitive equals" qualifier

Hmm, it would be interesting if there was an alternative, but really the 
problem lies in the database.  Any wildcard type searches (at least that start 
with a wildcard) are not likely to use indexes and cause a table scan.  If your 
database supported the concept of having case-insensitive indexes, then I would 
think it would be pretty trivial to implement in your own qualifier.  In Sybase 
(the database I use mostly) you can create a functional index (an index based 
on a function), but it’s the equivalent creating a second column that’s always 
lower and maintaining an index on it.  The difference being that the database 
maintains it for you (which is usually annoying for EOF).

I’ve always wanted a better way to do this too.

-Lon

On Wed, Feb 24, 2016 at 5:13 PM, Paul Hoadley 
<[email protected]<mailto:[email protected]>> wrote:
Hello,

Say you have a web application where the login identifier is the user’s email 
address. This works in the conventional way: the user supplies that address at 
sign-up, and it serves two in-app functions: login identifier, and actual email 
address to which notifications can be sent. This is a fairly common pattern 
among some large, modern web apps.

It turns out that not everyone understands case sensitivity. We are seeing 
login failures in the wild because a user that signed up as 
“[email protected]<mailto:[email protected]>” is now trying to log in 
with “[email protected]<mailto:[email protected]>”, or vice versa. Here 
are some facts:

1. It would seem to be at least reasonably common for modern web apps that use 
email addresses as login identifiers to ignore case at login time. (For 
example, I tested a couple I had open in browser tabs: Strava and Bitbucket 
ignore case.)

2. Although the domain part of an email address is case-insensitive, my 
understanding is that the relevant RFCs suggest that you shouldn’t make 
assumptions about the local part. While everything I’ve read claims that in 
practice it will make no difference, let’s assume that we need to preserve the 
address as entered at sign-up. (It’s fail-safe to do so, whether we strictly 
need to or not.)

So, 1 is our aim: ignore case on the login identifier at login time. But 
because of 2, we don’t want to, say, normalise the email address given at 
sign-up to lower case and just store that, on the off chance that it makes a 
difference for mail delivery for that particular user. (Again, it probably 
won’t, but let’s assume that it could for the exercise.)

What are our options for finding the right User entity at login time?

1. We can jump in and naively use a CaseInsensitiveLike qualifier, but then a 
user can stick ‘?’ and ‘*’ wildcards in the input. We could strip those out, 
but they’re actually both valid characters in the local part. I stopped short 
of trying to escape them, as this route is starting to seem a little dangerous.

2. We could track both the supplied and a lower-cased version of the identifier 
in separate attributes. This has the advantage of presumably working, but it’s 
awkward, requiring special attention to changing the normalised attribute when 
the user-supplied one changes.

Can anyone suggest a better way? What I really need is a CaseInsensitiveEquals 
qualifier, like Java’s equalsIgnoreCase(). Is there such a thing? Would it be 
easily implemented?


--
Paul Hoadley
http://logicsquad.net/<https://urldefense.proofpoint.com/v2/url?u=http-3A__logicsquad.net_&d=CwMGaQ&c=BNNF-YNv0CLLslhP2Bcx5Q&r=R0ZqsewJs3eSJk7vLCqZv0r5kJlLXQLnGTeg9t8MlqA&m=7KLqWZrwsKBkwT1MhLMXnrodOnaySMsD2CRgkNTkB3E&s=9GkT3pIMTpVcb-YdxdzUW-JxrBfge-18-sg9YGXVa5c&e=>




 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      
([email protected]<mailto:[email protected]>)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/lon.varscsak%40gmail.com<https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apple.com_mailman_options_webobjects-2Ddev_lon.varscsak-2540gmail.com&d=CwMGaQ&c=BNNF-YNv0CLLslhP2Bcx5Q&r=R0ZqsewJs3eSJk7vLCqZv0r5kJlLXQLnGTeg9t8MlqA&m=7KLqWZrwsKBkwT1MhLMXnrodOnaySMsD2CRgkNTkB3E&s=4t8j4KbqcLC7EqqbEB7D1ilbVGi4lCRxS6KGbYoQkl8&e=>

This email sent to [email protected]<mailto:[email protected]>

_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      
([email protected]<mailto:[email protected]>)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/mark.morris%40experian.com<https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apple.com_mailman_options_webobjects-2Ddev_mark.morris-2540experian.com&d=CwMFaQ&c=BNNF-YNv0CLLslhP2Bcx5Q&r=R0ZqsewJs3eSJk7vLCqZv0r5kJlLXQLnGTeg9t8MlqA&m=q5mXA72q_uaqTe7846sZ0c7odLKAoKBj7sR_AhluMdA&s=Gmi0RQgfJduQgn54kcMWy7hDRBSsXhEfW7-0B7Xi26M&e=>

This email sent to [email protected]<mailto:[email protected]>

_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      
([email protected]<mailto:[email protected]>)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/samuel%40samkar.com<https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apple.com_mailman_options_webobjects-2Ddev_samuel-2540samkar.com&d=CwMFaQ&c=BNNF-YNv0CLLslhP2Bcx5Q&r=R0ZqsewJs3eSJk7vLCqZv0r5kJlLXQLnGTeg9t8MlqA&m=q5mXA72q_uaqTe7846sZ0c7odLKAoKBj7sR_AhluMdA&s=tLmzLNBcFcUYu_AixRKsVsNg0kh4ThgOCvNoGRO-ZKo&e=>

This email sent to [email protected]<mailto:[email protected]>


 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      ([email protected])
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to [email protected]

Reply via email to