Probably this sql command will do the job -
delete from metadatavalue where metadata_value_id not in (select
min(metadata_value_id) from metadatavalue T2 where
T2.item_id=metadatavalue.item_id AND
T2.metadata_field_id=metadatavalue.metadata_field_id AND
T2.text_value=metadatavalue.text_value);

However, I was trying it on a copy of the original table and it is taking
forever to execute. Let me know if anyone has a better suggestion.
Thanks,
Koushik

On Tue, Aug 3, 2010 at 12:14 PM, Koushik Banerjee <
[email protected]> wrote:

> Hello Allen,
> Thanks a lot for the guide. Here is what further investigation has
> unearthed ->
> 1> This duplicate metadata issue doesn't occur with new submissions. For
> this installation, the DB was copied from an earlier version. It is only
> with these old items that the duplicate metadata entries show up.
> 2> The problem is in the DB itself. In the metadatavalue table, for each
> item_id there are duplicate entries for each metadata. It looks something
> like this -
>
> >> select * from metadatavalue where item_id=1000;
>
>  metadata_value_id | item_id | metadata_field_id |
>              text_value                                       | text_lang |
> place | authority | confidence
>   988                  |    1000  |                      15 | 2001-12
>                                                                         |
>             |     1    |              |         -1
>  989                   |    1000  |                      26 | Rescission\r
>                                                                         | en
>          |     1    |              |         -1
> ....
> ....
> 5204                  |    1000  |                      15 | 2001-12
>                                                                         |
>             |     1    |              |         -1
> 7654                  |    1000  |                      26 | Rescission\r
>                                                                       | en
>        |     1    |              |         -1
> ...
> ...
>
> Do you have a suggestion exactly how this can be solved? I am not very
> comfortable with DBs and do not want to try something arbitrary and
> completely mess it up. Or may be if you can refer a particular resource on
> the net, going through which will help me write my own sql queries to get
> rid of this problem.
>
> Thanks again,
> Koushik
>
> On Thu, Jul 29, 2010 at 11:40 PM, Allen Lam <[email protected]>wrote:
>
>>  Hi Koushik,
>>
>> The immediate task is to find out why there are duplicate entries in the
>> db, and stop creating duplication again.
>>
>> I don't know your installation or customization history, so this is a wild
>> guess again. There could be some errors made in your item submission form.
>> Check the item submission xml under config. If it is not the cause of the
>> problem..., I don't know. Please look back what changes you've made to the
>> system recently.
>>
>> OK. The next step is to remove the duplicate entries in the database.
>> Either you delete it manually in the edit item page one-by-one, or you do it
>> in the database level, do some SQL queries and write some programs to
>> automate the de-dup and delete process. Luckily you only have to deal with
>> one db table named 'metadatavalue'.
>>
>> Hope this rough guide could kick start your rescue process.
>>
>>
>> Best,
>> Allen Lam.
>> HKU Scholars Hub Administrator, http://hub.hku.hk
>>
>>
>>
>> On 2010-07-30 3:46 AM, Koushik Banerjee wrote:
>>
>> Thanks Allen.
>> I checked the "edit item" page. That also has duplicate entries i.e. two
>> dc.contributor.author, two dc.date.issued fields etc which can be edited.
>> Does it mean that the database itself is wrong? In that case, how do I get
>> rid of this problem?
>>
>>  Thanks
>>
>> On Thu, Jul 29, 2010 at 1:30 PM, Allen Lam <[email protected]>wrote:
>>
>> Hi Koushik,
>>
>> First try to make sure there is no duplication in the database.
>> Go into the Edit item page to check.
>>
>> If the problem is not with the db, it may be caused by incorrect indexing.
>> There could be some errors in your config file. Without extra information
>> it is only a wild guess.
>> But remember that any changes to indexing config need a re-indexing
>> afterward to see the effect.
>>
>>
>> Best,
>> Allen Lam.
>> HKU Scholars Hub Administrator, http://hub.hku.hk
>>
>>
>>
>>  On 2010-07-30 1:19 AM, Koushik Banerjee wrote:
>>
>>  The same thing happens in both JSPUI and XMLUI. I guess that means the
>> problem is not at the UI layer but something deeper. The Browse or Handle
>> servlets?
>>
>> This dual-display happens while viewing a particular item i.e. url like
>> [host]/[dspace or xmlui]/handle/number1/number2
>> also the author names are displayed twice when i browse using titles with
>> an url [host]/[dspace or xmlui]/browse?type=title
>>
>>  Any help is much appreciated.
>>
>> On Tue, Jul 27, 2010 at 5:47 PM, Koushik Banerjee <
>> [email protected]> wrote:
>>
>> Hello, I am trying out a DSpace 1.6 installation on postgresql. When I
>> view an item, all metadata is displayed twice. It looks something like this
>> Title: This is the sample title
>>         This is the sample title
>> Authors: Author 1
>>              Author 1
>>              Author 2
>>              Author 2
>> Issue Date: 21-May-2008
>>                   21-May-2008
>>
>>  etc etc. Anyone any idea. At what level can this problem be?
>>
>>  Thanks.
>>
>>
>>
>> ------------------------------------------------------------------------------
>> The Palm PDK Hot Apps Program offers developers who use the
>> Plug-In Development Kit to bring their C/C++ apps to Palm for a share
>> of $1 Million in cash or HP Products. Visit us here for more details:
>> http://p.sf.net/sfu/dev2dev-palm
>>
>>
>> _______________________________________________
>> DSpace-tech mailing 
>> [email protected]https://lists.sourceforge.net/lists/listinfo/dspace-tech
>>
>>
>>
>> ------------------------------------------------------------------------------
>> The Palm PDK Hot Apps Program offers developers who use the
>> Plug-In Development Kit to bring their C/C++ apps to Palm for a share
>> of $1 Million in cash or HP Products. Visit us here for more details:
>> http://p.sf.net/sfu/dev2dev-palm
>> _______________________________________________
>> DSpace-tech mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> The Palm PDK Hot Apps Program offers developers who use the
>> Plug-In Development Kit to bring their C/C++ apps to Palm for a share
>> of $1 Million in cash or HP Products. Visit us here for more details:
>> http://p.sf.net/sfu/dev2dev-palm
>> _______________________________________________
>> DSpace-tech mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>>
>>
>
------------------------------------------------------------------------------
The Palm PDK Hot Apps Program offers developers who use the
Plug-In Development Kit to bring their C/C++ apps to Palm for a share
of $1 Million in cash or HP Products. Visit us here for more details:
http://p.sf.net/sfu/dev2dev-palm
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

Reply via email to