I've come up with a temporary solution that works (at least it does today)
on the items that we (currently) have.  In the DSpace source, I modified our
org.dspace.browse.Browse so that the getTargetColumns method contains these
lines:

        else if (browseType == ITEMS_BY_AUTHOR_BROWSE)
            return "distinct item_id, sort_author";

This gave me the correct number of results, but it seems to break the
sort-by-title feature of this browse.  So in the Manakin source, I modified our
BrowseAuthorItems class in the ArtifactBrowser aspect so that the addBody
method re-sorts the results by the dc.title element:

        Item[] items = browseInfo.getItemResults();
        Arrays.sort(items, new ItemComparator("title", null, Item.ANY, true));
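
For anyone who wants to try the re-sort idea without a DSpace build at hand,
here is a rough standalone sketch.  BrowseRow is a hypothetical stand-in for
org.dspace.content.Item, the comparator is a plain java.util.Comparator rather
than DSpace's ItemComparator, and the case-insensitive ordering with an item id
tie-break is my assumption, not necessarily what ItemComparator does.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for one browse result; not a DSpace class.
class BrowseRow {
    final int itemId;
    final String title;

    BrowseRow(int itemId, String title) {
        this.itemId = itemId;
        this.title = title;
    }
}

class TitleSortSketch {
    // Orders rows by title, ignoring case (an assumption); equal titles
    // fall back to item id so the result is deterministic.
    static final Comparator<BrowseRow> BY_TITLE = new Comparator<BrowseRow>() {
        @Override
        public int compare(BrowseRow a, BrowseRow b) {
            int byTitle = a.title.compareToIgnoreCase(b.title);
            return byTitle != 0 ? byTitle : Integer.compare(a.itemId, b.itemId);
        }
    };

    // Returns a sorted copy and leaves the caller's array untouched.
    static BrowseRow[] sortedByTitle(BrowseRow[] rows) {
        BrowseRow[] copy = Arrays.copyOf(rows, rows.length);
        Arrays.sort(copy, BY_TITLE);
        return copy;
    }
}
```

Sorting a copy keeps the original browse order available in case another part
of the page still relies on it.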

 

 

We're also checking with our library liaison whether it would be acceptable
to remove the dc.contributor element from the metadata when its value is
identical to the value in dc.contributor.author.  There are concerns about
both of these approaches, so I'm still open to other ideas about how to
handle the situation.

 

I've also received a patch from Christophe Dupriez that fixes issues with
duplicate items in the JSP interface.  I can forward it along to anyone who
would like to try this route.

 

Keith Gilbertson

Systems Developer

Ohio Library and Information Network

 

  _____  

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Keith
Gilbertson
Sent: Thursday, November 08, 2007 10:46 AM
To: dspace-tech@lists.sourceforge.net
Subject: [Dspace-tech] Duplicate items in browse items by author

 

 

Hello,

 

I'm working on troubleshooting an issue with an installation of DSpace
1.4.2 and Manakin 1.1.  When browsing items by certain authors, the items
appear twice in the artifact list.  An example can be seen here:

 

http://drc.libraries.wright.edu/browse-author-items?author=The+Dayton-Wright+Airplane+Company

 

The items by this author were added to the collection via the DSpace
ItemImport tool, but this is also occurring for items that were submitted
manually by users through the Manakin web interface.

 

When I examine the full item records for these items that are being listed
twice in the items by author browse, I see information similar to the
following:

 

contributor:              The Dayton-Wright Airplane Company en_US
contributor.author:       The Dayton-Wright Airplane Company en_US
contributor.institution:  Wright State University

 

There are three contributor fields and two of them have the same value.
When I look in the itemsbyauthor table in the database, I see the following
for one of these items:

 

 items_by_author_id | item_id |               author               |            sort_author
--------------------+---------+------------------------------------+------------------------------------
               4787 |     115 | The Dayton-Wright Airplane Company | the dayton-wright airplane company
               4788 |     115 | The Dayton-Wright Airplane Company | the dayton-wright airplane company
               4789 |     115 | Wright State University            | wright state university

 

Each item_id appears three times, including two times with the same author -
once for the contributor field and once for the contributor.author field.
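
To make the effect concrete, here is a small in-memory sketch of those three
rows and of what collapsing them to distinct (item_id, sort_author) pairs
would leave behind.  AuthorRow and DistinctRowsSketch are hypothetical names
for illustration only; none of this is DSpace code, and unlike a SQL DISTINCT
over two columns it keeps the first full row for each pair.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of one itemsbyauthor row; not a DSpace class.
class AuthorRow {
    final int rowId;
    final int itemId;
    final String sortAuthor;

    AuthorRow(int rowId, int itemId, String sortAuthor) {
        this.rowId = rowId;
        this.itemId = itemId;
        this.sortAuthor = sortAuthor;
    }
}

class DistinctRowsSketch {
    // Keeps the first row seen for each (item_id, sort_author) pair,
    // preserving the original row order.
    static List<AuthorRow> distinctByItemAndAuthor(List<AuthorRow> rows) {
        Set<String> seen = new HashSet<String>();
        List<AuthorRow> kept = new ArrayList<AuthorRow>();
        for (AuthorRow row : rows) {
            String key = row.itemId + "|" + row.sortAuthor;
            if (seen.add(key)) { // add() returns false for a repeated pair
                kept.add(row);
            }
        }
        return kept;
    }
}
```

Fed the three rows shown above, this would keep rows 4787 and 4789 and drop
4788, so item 115 would still be listed once under each of its two authors.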

 

Has anyone dealt with items displaying multiple times in browse by author
views, and how did you handle it?  Are multiple occurrences of the same item
with the same author in the itemsbyauthor table allowed by design?

 

What would be the best way for us to fix this on our installation?  I've
collected some ideas but I'm unsure of all of the consequences.

 

   - Change the metadata for our items so that the unqualified contributor
     element is not used.  Contributor.author may be sufficient.

   - Change the XSLT that creates the browse table to check if the current
     item is a duplicate of the previous sibling before displaying it.  The
     problem also exists with the JSP interface, but we use only the Manakin
     interface.

   - Change the underlying database query for browsing items by author so that
     only tuples with distinct item_id values are returned.

   - Change the item submission tools so that the author/item_id combination
     is not duplicated between rows in the itemsbyauthor table.
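
As a rough illustration of the last idea, the sketch below gathers all of one
item's contributor values and keeps one display name per distinct sort form
before any rows would be written.  AuthorIndexSketch and its methods are
hypothetical, and the trim-and-lowercase normalization is only an assumption
based on how sort_author looks in the table above; the real indexing code may
normalize differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; not part of DSpace or Manakin.
class AuthorIndexSketch {
    // Assumed normalization: trim whitespace and lowercase.
    static String sortForm(String author) {
        return author.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the names to index for one item: one entry per distinct
    // sort form, in first-seen order, using the first spelling seen.
    static List<String> authorsToIndex(List<String> contributorValues) {
        Map<String, String> bySortForm = new LinkedHashMap<String, String>();
        for (String value : contributorValues) {
            String key = sortForm(value);
            if (!bySortForm.containsKey(key)) {
                bySortForm.put(key, value.trim());
            }
        }
        return new ArrayList<String>(bySortForm.values());
    }
}
```

With the three contributor values from the example item as input, this would
yield two names, so only two rows would be written for item 115.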

 

 

guidance on a solution.

 

 

 

 
