Hi,

We tested this on our database and the results were fine, so we thought it
would be good to share it on this list:

First, we use phpPgAdmin to generate a tsv list of the metadata whose
quality we want to improve:

SELECT * FROM "public"."metadatavalue" WHERE "metadata_field_id" = 'X'

For example, to work on the journal titles in the metadata field
dc.relation.ispartof, whose id is 42, I replace "metadata_field_id" = 'X'
at the end of the query with "metadata_field_id" = '42'

The next step is to download and run Google Refine:
https://code.google.com/p/google-refine/

It is a web application that allows batch editing of large amounts of data
and has very good algorithms for grouping similar values. Use the tsv to
create a new project (do not forget to choose the UTF-8 character encoding).

Google Refine has two features that can greatly help in editing the
records: Text Facet and Cluster and Edit. The first builds a facet of all
the values in a column, and the second has algorithms that create clusters
of values it estimates to be equivalent.
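
To give an idea of how this kind of clustering works, here is a minimal
Python sketch of a key-collision approach: each value is reduced to a
normalized key, and values that share a key are proposed as one cluster.
It is a simplified approximation for illustration, not Google Refine's
actual code; the function names and the sample titles are made up.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Simplified key that ignores case, accents, punctuation and word order."""
    text = value.strip().lower()
    # Strip accents by decomposing characters and dropping combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, drop duplicate tokens, sort, and rejoin.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values by key; keep only groups with more than one spelling."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Made-up journal titles, as they might appear in dc.relation.ispartof.
titles = [
    "Revista de Saúde Pública",
    "Revista de Saude Publica",
    "revista de saúde pública.",
    "Química Nova",
]
for group in cluster(titles):
    print(group)
```

The three spellings of the first title end up in one cluster, and the
title that appears only once is not reported.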

Use Google Refine to fix all the database records and then export the
results in tsv again.

Open this file in Notepad++ and use the Replace function. Select Regular
expression as the search mode and, in Find what, enter:

^([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)$

and replace with:

update metadatavalue set text_value = '\4', text_lang = '\5' where
metadata_value_id = \1;

The result is that every line is transformed into a SQL UPDATE that changes
text_value and text_lang for the row with the corresponding
metadata_value_id.
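
The same transformation can also be sketched in Python, which makes it
easy to handle values that contain a single quote (an apostrophe inside
'...' would otherwise break the statement). This is only a sketch under
assumptions: it assumes the eight tab-separated columns matched by the
regular expression, with the id in column 1, the text in column 4 and the
language in column 5. The function names and the sample data, including
the column names in the header row, are illustrative.

```python
import csv
import io


def sql_literal(value):
    """Quote a string for SQL by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def tsv_to_updates(tsv_text):
    """Turn each eight-column TSV row into one UPDATE statement."""
    reader = csv.reader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    statements = []
    for row in reader:
        # Skip a header row, blank lines, and any line that does not have
        # exactly 8 fields with a numeric id in the first one.
        if len(row) != 8 or not row[0].isdigit():
            continue
        value_id, text_value, text_lang = row[0], row[3], row[4]
        statements.append(
            "update metadatavalue set text_value = {}, text_lang = {} "
            "where metadata_value_id = {};".format(
                sql_literal(text_value), sql_literal(text_lang), value_id
            )
        )
    return statements


# Made-up sample: one header row and one data row (assumed column names).
sample = (
    "metadata_value_id\titem_id\tmetadata_field_id\ttext_value\t"
    "text_lang\tplace\tauthority\tconfidence\n"
    "101\t7\t42\tJournal of Children's Health\tpt_BR\t1\t\t-1\n"
)
for statement in tsv_to_updates(sample):
    print(statement)
```

Unlike the plain search and replace, lines that do not match are dropped
here instead of being left in the file unchanged.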

Save the file as .sql and run it against the database.

Sorry for bad english

Tiago R. M. Murakami
Librarian
Brazil
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette