So is there a way to create such case, whitespace and punctuation insensitive 
index? I tried using http://marklogic.com/collation/S1 as Geert suggested and 
that worked as far as capitalization goes (but it does not work e.g. for 
whitespace – ‘Houston ‘ would still be a miss).

In addition, I had to specify the S1 collation at the app server level - 
ideally, I would want it to be at the field/index level.

At any rate, handling capitalization/whitespace is only a small part of the 
story, other use cases include matching on a substring as in
substring( ZipCode, 1, 3 ) = “921”
(would wildcard search be faster than substring?) or a phonetic match as in
spell:double-metaphone( substring-before( Email, '@' )  )[1] = “aptn”

So for an ultimate solution, I am now looking at CPF/triggers as Justin 
suggested…

Thanks!

[Forward Slash]

[Elevate]

Alexei Betin

Principal Architect; Big Data
P: (817) 928-1643 | Elevate.com<http://www.elevate.com>
4150 International Plaza, Suite 300
Fort Worth, TX 76109


Privileged and Confidential. This e-mail, and any attachments thereto, is 
intended only for use by the addressee(s) named herein and may contain 
privileged and/or confidential information. If you have received this e-mail in 
error, please notify me immediately by a return e-mail and delete this e-mail. 
You are hereby notified that any dissemination, distribution or copying of this 
e-mail and/or any attachments thereto, is strictly prohibited.


From: [email protected] 
[mailto:[email protected]] On Behalf Of Paul Hoehne
Sent: Thursday, January 08, 2015 5:04 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] function index in MarkLogic?

So one reason to do this, in the example given, is that (hypothetically) a city 
field in an address might have:

HOUSTON, houston, Houston, HOuston, etc.

Rather than using a computed field, you would use a case insensitive, maybe 
whitespace and punctuation insensitive index, so
NEW YORK
NEW   YORK
New YORK
New York
new york
etc, would all be considered ‘equal.’ in terms of the index.

While this doesn’t cover every instance where you might want a function index, 
it handles those cases where you’re worried about the case of the text, 
punctuation, or spaces.


On Jan 8, 2015, at 2:47 PM, Alexei Betin 
<[email protected]<mailto:[email protected]>> wrote:

Hello,

Does MarkLogic has something similar to “function index” as exists in other 
databases such as Oracle? For example, in Oracle one can create a function 
index on a value of UPPER(<field_name>) as follows:

CREATE INDEX cities_fn_idx ON cities (UPPER(name));
which ensures that a query like the one below is efficient and not using the 
full table scan:
SELECT name FROM cities WHERE UPPER(name) = 'HOUSTON';

I’d like to do something similar with MarkLogic which could help speed up, e.g. 
the following simple query (which currently runs prohibitively slow despite an 
index on /A/City):

collection()/A/[upper-case( City ) = 'HOUSTON']

upper-case() is only an example, I am looking for a solution that would 
accommodate any function/transformation that derives a value to be used in a 
query condition from an existing element.

Clearly, one solution would be to pre-compute the derived field, add it to all 
the documents in the collection, and create an index on the derived field – 
this is possible, but I want to make sure I am not missing a solution that does 
not require data modification (such as a “function index”).

Thanks,

<image005.jpg>

<image006.jpg>

Alexei Betin

Principal Architect; Big Data
P: (817) 928-1643 | Elevate.com<http://www.elevate.com/>
4150 International Plaza, Suite 300
Fort Worth, TX 76109


Privileged and Confidential. This e-mail, and any attachments thereto, is 
intended only for use by the addressee(s) named herein and may contain 
privileged and/or confidential information. If you have received this e-mail in 
error, please notify me immediately by a return e-mail and delete this e-mail. 
You are hereby notified that any dissemination, distribution or copying of this 
e-mail and/or any attachments thereto, is strictly prohibited.


_______________________________________________
General mailing list
[email protected]<mailto:[email protected]>
http://developer.marklogic.com/mailman/listinfo/general

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general

Reply via email to