Hi Richard/Shawna, et al,

As promised (and only slightly late), I've updated my patch for
diacritics / sort order handling:

<http://sourceforge.net/tracker/index.php?func=detail&aid=1672065&group_id=19984&atid=319984>

This new version is quite an overhaul from where I left the initial
patch on Friday. First thing to note is that all sort order handling has
been extended to cover titles and subjects as well as authors.

Second thing, it's quite configurable ;-). However, the PluginManager
doesn't currently allow for named sequences, so if you want to configure
it you need to modify the code slightly; it's a very minor edit.

The patch now defines a set of filter classes in
org.dspace.browse.order, all of which implement a standard interface,
allowing them to be chained together to generate the ordering string.

The filter chains are currently defined in
org.dspace.browse.BrowseOrdering - there are three arrays that can be
changed for filters that you want to use.
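To give a rough idea of what chaining filters means, here is a minimal
standalone sketch. It is not the code from the patch: the interface name
OrderFilter, its method signature, the two toy filters and the
TITLE_FILTERS array are all made up for illustration. Each filter takes
the output of the previous one, and the final result is the ordering
string.

```java
import java.util.Locale;

public class FilterChainSketch {
    /** Hypothetical stand-in for the patch's filter interface. */
    public interface OrderFilter {
        String filter(String value, String lang);
    }

    // Two toy filters, assumed for this example only.
    public static final OrderFilter TRIM = (value, lang) -> value.trim();
    public static final OrderFilter LOWER_CASE =
            (value, lang) -> value.toLowerCase(Locale.ROOT);

    // Illustrative chain; editing an array like this is the kind of
    // "very minor edit" needed while sequences can't be configured.
    public static final OrderFilter[] TITLE_FILTERS = { TRIM, LOWER_CASE };

    /** Runs the value through every filter in order. */
    public static String makeSortString(String value, String lang,
                                        OrderFilter[] chain) {
        String result = value;
        for (OrderFilter f : chain) {
            result = f.filter(result, lang);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(makeSortString("  The Hobbit ", "en", TITLE_FILTERS));
    }
}
```

Because the filters are applied in array order, swapping two entries
can change the resulting sort string, which is why the order of the
chain matters.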

As standard, the list of filters defined almost exactly replicates
existing DSpace behaviour - the only change is the addition of a filter
to decompose diacritics (this will stop the *really* funky ordering of
diacritics that we currently see).
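For anyone wondering why decomposing helps: a precomposed character
such as U+00E9 (e-acute) compares higher than 'z' in a plain code point
comparison, whereas its decomposed form starts with a plain 'e'. The
sketch below shows the idea using java.text.Normalizer (Java 6+). That
is my assumption for the example, not necessarily how the filters in
the patch are implemented, and the class and method names are invented.

```java
import java.text.Normalizer;

public class DiacriticsSketch {
    /** Splits accented characters into base letter + combining mark (NFD). */
    public static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    /** Decomposes, then drops the combining marks (Unicode category M). */
    public static String strip(String s) {
        return decompose(s).replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        String ecole = "\u00e9cole"; // "ecole" with an acute accent on the first e

        // Precomposed form sorts after "zebra" in a plain comparison...
        System.out.println(ecole.compareTo("zebra") > 0);
        // ...while the decomposed form sorts before it.
        System.out.println(decompose(ecole).compareTo("zebra") < 0);
        // Stripping leaves only the base letters.
        System.out.println(strip(ecole));
    }
}
```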

Available filters include:

org.dspace.browse.order.StripDiacritics
 - removes diacritics from the string. If you want to replace all
diacritics with English equivalents, chain this *after*
DeprecatedDiacritics.

org.dspace.browse.order.LoCInitialArticleWord
 - this is based on the Library of Congress standards for ignoring
definite and indefinite articles at the start of a title. It includes
more words than the standard English behaviour (d' and ye), and is
LANGUAGE SENSITIVE - so if you have an Icelandic title, and supply the
two-character ISO code for the language of the title in the metadata
('is'), then it will remove 'hin', 'hina', 'hinar', etc. from the start
of the title.

(see <http://www.loc.gov/marc/bibliographic/bdapp-e.html>)

org.dspace.browse.order.LocaleDependent
 - does Locale-based sorting. You need to specify the locale to use in
'webui.browse.sort.locale' (dspace.cfg) - it uses one locale for
everything (it can't really work any other way). You should specify this
instead of any diacritic filters if you want to have correct diacritic
ordering for a given Locale (e.g. French).

NOTE - Locale-based sorting requires the generation of a
non-human-readable string. This means you won't be able to make sense of
it in the database, and can't really chain any filters after it.
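To illustrate why the Locale-based sort string is unreadable, here is a
small sketch that builds one from a collation key. It assumes the JDK's
java.text.Collator and a hex encoding of the key bytes; the patch may
well use a different collator or encoding, and the class and method
names here are invented.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

public class LocaleSortSketch {
    /**
     * Returns a string whose plain lexicographic order follows the
     * collation rules of the given locale. The content is a hex dump of
     * the collation key, so it means nothing to a human reader.
     */
    public static String sortKey(String value, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        CollationKey key = collator.getCollationKey(value);
        StringBuilder hex = new StringBuilder();
        for (byte b : key.toByteArray()) {
            hex.append(String.format("%02x", b)); // fixed width keeps byte order
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String cote = "c\u00f4te"; // "cote" with a circumflex on the o
        // Plain comparison puts the accented word after "coup"...
        System.out.println(cote.compareTo("coup") > 0);
        // ...while the locale-aware keys put it before.
        System.out.println(
            sortKey(cote, Locale.FRENCH).compareTo(sortKey("coup", Locale.FRENCH)) < 0);
    }
}
```

Since the output is an encoded key rather than text, a later filter
that expects readable characters (lower-casing, article stripping, and
so on) has nothing sensible to work on, hence the advice not to chain
anything after it.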


IMPORTANT NOTES - none of the filters (e.g. decomposing diacritics,
stripping diacritics, etc.) affect what you see displayed as text. They
only affect the internal sort columns.
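To make the display/sort distinction concrete, here is a toy version of
a language-sensitive initial-article filter. The word lists are a tiny
illustrative subset (only the words mentioned above plus a/an/the), and
the class and method names are made up; the real lists come from the
Library of Congress appendix. The original title is left untouched and
only the derived sort value changes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ArticleSketch {
    // Illustrative subset only, keyed by two-character language code.
    private static final Map<String, List<String>> ARTICLES = new HashMap<>();
    static {
        ARTICLES.put("en", Arrays.asList("a", "an", "the", "d'", "ye"));
        ARTICLES.put("is", Arrays.asList("hin", "hina", "hinar"));
    }

    /** Returns the title without a leading article for the given language. */
    public static String stripArticle(String title, String lang) {
        List<String> articles = ARTICLES.get(lang);
        if (articles == null) {
            return title; // unknown language: leave the title alone
        }
        for (String article : articles) {
            // Elided forms such as d' attach directly; others need a space.
            String prefix = article.endsWith("'") ? article : article + " ";
            if (title.regionMatches(true, 0, prefix, 0, prefix.length())) {
                return title.substring(prefix.length()).trim();
            }
        }
        return title;
    }

    public static void main(String[] args) {
        String display = "The Hobbit";               // what the user sees
        String sort = stripArticle(display, "en");   // what gets sorted on
        System.out.println(display + " -> " + sort);
    }
}
```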

Also, if you ever change the list of filters, you MUST re-initialize the
browse:

index-all
or
dsrun org.dspace.browse.InitializeBrowse

Err, I think that's it for now. Have fun ;-)

G

-- 
Graham Triggs
Technical Architect
Open Repository

Tel:   +44 (0)20 7631 9942
Skype: grahamtriggs
