Hi,

Yes, that was a very early iteration of the ordering patch, so it's not
surprising that you ran into problems. That said, I thought I had tested
that particular piece of functionality. It's been a while since I looked
at the 1.4.x browse code (I've been too immersed in the new 1.5+ system),
but it should have been normalising the incoming data somewhere anyway.

Note that the lookup occurs against sort_author because that column is
indexed - looking up the value in author would hurt the scalability of
the code - either horribly in the case of an unindexed lookup, or a bit
if you chose to additionally index the author column.
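
To make the point concrete, here is a minimal sketch of normalising the
incoming value so that it can be compared against sort_author. It is an
illustration only, not the actual 1.4.x or 1.5 code: the class and method
names are made up, and it uses the JDK's java.text.Normalizer rather than
the ICU class from your patch so that it runs without extra jars.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical demo class; not part of DSpace.
class SortKeyDemo {

    // Assumed helper: reduce an author name to a lowercase,
    // diacritic-free key, mirroring the transformation in the patch.
    static String toSortKey(String name) {
        if (name == null) {
            return "";
        }
        // NFD splits each accented letter into base letter + combining mark.
        String decomposed = Normalizer.normalize(
                name.toLowerCase(Locale.ROOT), Normalizer.Form.NFD);
        // Drop the combining marks, leaving only the base letters.
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        // The value as it might arrive in the GET parameter.
        String incoming = "Mu\u00f1oz, Jos\u00e9";
        System.out.println(toSortKey(incoming)); // prints "munoz, jose"
    }
}
```

The resulting key is what would be bound into the query against the
indexed sort_author column, so the display form in author never needs to
be searched.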

You may have been better served applying the patch from:
http://sourceforge.net/tracker/index.php?func=detail&aid=1672065&group_id=19984&atid=319984

which is closer to the functionality in 1.5 - although even that has had
some changes in the latest iteration. Now, all the normalisation code
has been split out of the browse package, and has been tied in to
generating new fields in the Lucene index. So, all the controls that
have been demonstrated for sorting the browse lists are also available
on search results - and the options that you have configured are
synchronized between the two parts of the interface (although when you
upgrade to 1.5 or change your sort options, you will need to recreate
both the browse tables and Lucene indexes).

G

Ron Stevenhaagen wrote:
>
> _Normalize for Diacritics in Browse Author to correct sort order_
>
> I have applied the suggested fix to normalize the sort when browsing
> authors as found here:
>
>
> http://www.mail-archive.com/[email protected]/msg00570.html
>
> The ‘Authors’ now sort correctly as a result of this fix however when
> clicking on an author with diacritics the documents submitted by that
> author are not displayed.
>
> The diacritic form of the author’s name is submitted through the GET
> METHOD which packages the information into the URL and calls
> ItemsByAuthorServlet.java. By tracing through the code we discovered a
> comparison is made to the “sort_author” field in table
> “itemsbyauthor”. Because we are comparing the diacritic form of the
> author provided by the GET METHOD to the normalized author from
> “sort_author”, no documents are retrieved. Should we not be doing a
> comparison to the “author” field from table “itemsbyauthor” which
> holds the author’s name in the diacritic form?
>
> For the time being, to make the link to the documents we modified the
> code in ItemsByAuthorServlet.java to normalize the author submitted by
> the GET METHOD. This creates some extra code which does not seem as
> clean as accessing the author field directly.
>
> Here is the patch I put together to work around this for the time being:
>
> --- ItemsByAuthorServlet.java (original)
> +++ ItemsByAuthorServlet.java (modified)
> @@ -58,6 +58,10 @@
> import org.dspace.core.Context;
> import org.dspace.core.LogManager;
> +// For normalizing author
> +import com.ibm.icu.text.Normalizer;
> +
> /**
> * Displays the items with a particular author.
> *
> @@ -118,6 +121,10 @@
> author = "";
> }
> + // Normalize author
> + author = Normalizer.normalize(author.toLowerCase(), Normalizer.NFD)
> + .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
> +
> // Do the browse
>
> ------------------------------------------------------------------------
>
> ------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> ------------------------------------------------------------------------
>
> _______________________________________________
> DSpace-tech mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>


 
 


