So is there a way to create such case, whitespace and punctuation insensitive index? I tried using http://marklogic.com/collation/S1 as Geert suggested and that worked as far as capitalization goes (but it does not work e.g. for whitespace – ‘Houston ‘ would still be a miss).
In addition, I had to specify the S1 collation at the app server level - ideally, I would want it to be at the field/index level. At any rate, handling capitalization/whitespace is only a small part of the story, other use cases include matching on a substring as in substring( ZipCode, 1, 3 ) = “921” (would wildcard search be faster than substring?) or a phonetic match as in spell:double-metaphone( substring-before( Email, '@' ) )[1] = “aptn” So for an ultimate solution, I am now looking at CPF/triggers as Justin suggested… Thanks! [Forward Slash] [Elevate] Alexei Betin Principal Architect; Big Data P: (817) 928-1643 | Elevate.com<http://www.elevate.com> 4150 International Plaza, Suite 300 Fort Worth, TX 76109 Privileged and Confidential. This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain privileged and/or confidential information. If you have received this e-mail in error, please notify me immediately by a return e-mail and delete this e-mail. You are hereby notified that any dissemination, distribution or copying of this e-mail and/or any attachments thereto, is strictly prohibited. From: [email protected] [mailto:[email protected]] On Behalf Of Paul Hoehne Sent: Thursday, January 08, 2015 5:04 PM To: MarkLogic Developer Discussion Subject: Re: [MarkLogic Dev General] function index in MarkLogic? So one reason to do this, in the example given, is that (hypothetically) a city field in an address might have: HOUSTON, houston, Houston, HOuston, etc. Rather than using a computed field, you would use a case insensitive, maybe whitespace and punctuation insensitive index, so NEW YORK NEW YORK New YORK New York new york etc, would all be considered ‘equal.’ in terms of the index. While this doesn’t cover every instance where you might want a function index, it handles those cases where you’re worried about the case of the text, punctuation, or spaces. On Jan 8, 2015, at 2:47 PM, Alexei Betin <[email protected]<mailto:[email protected]>> wrote: Hello, Does MarkLogic has something similar to “function index” as exists in other databases such as Oracle? For example, in Oracle one can create a function index on a value of UPPER(<field_name>) as follows: CREATE INDEX cities_fn_idx ON cities (UPPER(name)); which ensures that a query like the one below is efficient and not using the full table scan: SELECT name FROM cities WHERE UPPER(name) = 'HOUSTON'; I’d like to do something similar with MarkLogic which could help speed up, e.g. the following simple query (which currently runs prohibitively slow despite an index on /A/City): collection()/A/[upper-case( City ) = 'HOUSTON'] upper-case() is only an example, I am looking for a solution that would accommodate any function/transformation that derives a value to be used in a query condition from an existing element. Clearly, one solution would be to pre-compute the derived field, add it to all the documents in the collection, and create an index on the derived field – this is possible, but I want to make sure I am not missing a solution that does not require data modification (such as a “function index”). Thanks, <image005.jpg> <image006.jpg> Alexei Betin Principal Architect; Big Data P: (817) 928-1643 | Elevate.com<http://www.elevate.com/> 4150 International Plaza, Suite 300 Fort Worth, TX 76109 Privileged and Confidential. This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain privileged and/or confidential information. If you have received this e-mail in error, please notify me immediately by a return e-mail and delete this e-mail. You are hereby notified that any dissemination, distribution or copying of this e-mail and/or any attachments thereto, is strictly prohibited. _______________________________________________ General mailing list [email protected]<mailto:[email protected]> http://developer.marklogic.com/mailman/listinfo/general
_______________________________________________ General mailing list [email protected] http://developer.marklogic.com/mailman/listinfo/general
