Hi Karl,

Thanks for your suggestions. I found a solution for this sorting requirement that works perfectly in ML 8:

1. Create a field for the element.
2. Create a field range index with the following collation:

   http://marklogic.com/collation/en/S1/AN

   AN specifies that all characters are non-ignorable; that is, all spaces and punctuation characters are included when sorting. Since AN is the default in MarkLogic 7 and 8, the same result can also be achieved with http://marklogic.com/collation/en/S1
3. Define field tokenizer overrides, one for each punctuation character, with the class set to "remove".

With the above, the sorting works perfectly, as desired:

dø we go?
Do we go home?
Does it count?

Thanks,
Blessing.

On Thu, Mar 31, 2016 at 6:09 PM, Karl Bae <[email protected]> wrote:

> Hi Blessing,
>
> This is a very interesting topic!
> Just an idea: can the input string be manipulated before sorting? That is,
> can the space be replaced, just before sorting, with another character
> that precedes the alphabetic characters in ASCII order?
>
> For example, first manipulate the input string so that it has no symbols
> or special characters. Then replace each space with ‘%’, on the assumption
> that ‘%’ comes before the letter ‘A’.
> Now you can sort ‘do%we%go?’ before ‘Does%it%count?’
>
> Hope that helps,
> Karl
>
> Hi Team,
>
> I have a custom sorting requirement where the following rules apply:
>
> 1. Alphabetize letter by letter from A to Z.
> 2. Ignore the capitalization of letters.
> 3. Ignore mathematical symbols and any special characters that do not
>    include a Latin letter.
> 4. Ignore punctuation.
> 5. Do not ignore spaces. Spaces follow the rule of "nothing precedes
>    something".
>
> For example, the title "Do we go?" comes before the title "Does it count?",
> because the space after "Do" comes before any letter attached to the end of
> this string (in this case, the "es" in "Does").
>
> I managed to achieve all except #5 by creating a field with the following
> collation (I'm using search:search):
>
> http://marklogic.com/collation/en/S1/AS/T0000/NO
>
> The sort results below are fine, except that 'Does it count?' comes
> before 'dø we go?':
>
> Æ and Words That Look Fancy
> ;a next sentence
> A Test Paper
> Beautiful Test Paper
> Does it count?
> dø we go?
> Do we go home?
> New Paper
> -Ology and -Osophy: Make Your Major
> 'Soda' Vs 'Pop': Regional Differences
> The First Paper of the Bunch
> ÜBer and Cyberhailing
>
> Is there a collation I could define that does not ignore spaces while
> still ignoring punctuation and symbols?
>
> Thanks,
> Blessing.
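
For anyone who wants to check the expected ordering outside MarkLogic, the rules discussed in this thread (ignore case, ignore punctuation and symbols, keep spaces significant) can be approximated with a small Python sort key. This is only a rough sketch and not how MarkLogic collations work internally: the function name `sort_key` and the `EXTRA_FOLDS` table are made up for illustration, and the folding of 'ø' and 'æ' to plain letters is an assumption meant to imitate primary-strength (S1) comparison.

```python
import unicodedata

# Hypothetical fold table (an assumption): letters that NFKD does not
# decompose, mapped to base letters to imitate primary-strength comparison.
EXTRA_FOLDS = {"ø": "o", "æ": "ae"}


def sort_key(title: str) -> str:
    """Build a key that ignores case, accents, and punctuation but keeps spaces."""
    # Case-fold first, then decompose accented letters into base + combining marks.
    decomposed = unicodedata.normalize("NFKD", title.casefold())
    kept = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop accents such as the diaeresis in "Ü"
        ch = EXTRA_FOLDS.get(ch, ch)
        # Keep letters, digits, and spaces; drop punctuation and symbols.
        if ch.isalnum() or ch == " ":
            kept.append(ch)
    return "".join(kept)


titles = ["Does it count?", "dø we go?", "Do we go home?"]
ordered = sorted(titles, key=sort_key)
for title in ordered:
    print(title)
```

Because a space has a lower code point than any letter or digit, plain string comparison of these keys gives the "nothing precedes something" behaviour: the key "do we go" sorts before "do we go home", which sorts before "does it count".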
_______________________________________________ General mailing list [email protected] Manage your subscription at: http://developer.marklogic.com/mailman/listinfo/general
