We removed some empty metadata fields today. Whilst not a recommended approach, 
it's pretty easy. If you look at the logs, DSpace itself does the same thing.

Assuming metadata_field_id is 12345:

delete from metadatavalue where metadata_field_id='12345';
delete from metadatafieldregistry where metadata_field_id='12345';

Not near the DB at the moment but I believe those are the correct table names 
and field names. You need to get rid of the record from both tables.
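
A slightly more cautious way to run the same thing, as a rough sketch only: it 
assumes PostgreSQL and the same table and column names as above (not 
re-checked), with 12345 still just a placeholder id. Wrapping the statements 
in a transaction lets you look at the count first and back out if it is not 
what you expected.

begin;

-- how many values would be removed? check that this looks sane first
select count(*) from metadatavalue where metadata_field_id = 12345;

-- same order as above: the values first, then the registry entry
delete from metadatavalue where metadata_field_id = 12345;
delete from metadatafieldregistry where metadata_field_id = 12345;

-- type "rollback;" here instead if the count was not what you expected
commit;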

A function should be added to the UI that does this, since at the moment all 
the UI gives is an error saying that the field is in use. A further prompt 
could say 'Yep, I know, delete it anyway ;)'

cheers,
Steve


On 07/02/2011, at 7:07 PM, Timo Aalto wrote:

> Hi Stuart,
> 
> 2011/2/4 Stuart Lewis <s.le...@auckland.ac.nz>
> Hi Timo,
> 
> > I did some metadata gardening using the CSV bulk editor and as I uploaded 
> > my changes I noticed that the bulk metadata importer added an empty 
> > dc.description.abstract (with a 'fi' language identifier that is the 
> > default for our installation nowadays) into just about every record I had 
> > in that CSV (see https://helda.helsinki.fi/handle/10138/24620?show=full 
> > for example). Mea culpa, in that I should have noticed this before 
> > pressing the yes key, but I still think there should be some kind of 
> > sanity check in the importer to stop it creating empty metadata value 
> > fields.
> 
> There is a sanity check for this, so I'm not sure how it could have happened:
> 
>  - http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/app/bulkedit/DSpaceCSV.java#L481
> 
> Is the field definitely empty - could it have contained something like a 
> single space?
> 
> The field is definitely empty. I checked the CSV for any spaces between 
> commas and there are none. Also, the SQL query 
> 
>  SELECT text_value, text_lang from metadatavalue where metadata_field_id=27 
> and text_value='' and text_lang='fi';
> 
> returns 392 rows, about the number of affected Items. The same query with one 
> or more spaces in text_value returns zero.
> 
> Just a thought - the CSV contained Items originating from different DSpace 
> instances that have had different default metadata value languages set at 
> different times (i.e. there are abstracts written in Finnish but marked with 
> the 'en' language flag) - now the default metadata value language is set to 
> Finnish. Could it be possible that DSpace makes some ill-advised assumptions 
> based on the default metadata language?
> 
> Meanwhile I will try to re-export, edit and re-import the affected Items if 
> that would get rid of the empties...
>  
> 
> It looks like Item.java, when ingesting metadata, performs a trim() on the 
> values, to remove any leading or trailing spaces:
> 
>  - http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/content/Item.java#L684
> 
> The CSV importer does not perform a trim().  I'll need to test this to see if 
> it could be a problem, and if it does cause the problem you're seeing, then 
> we'll need to make sure the CSV importer performs a trim too, so that it 
> doesn't pass on a single space (or multiple spaces) to Item.java, which then 
> trims them and archives them.
> 
> Thanks,
> 
> 
> Stuart Lewis
> Digital Development Manager
> Te Tumu Herenga The University of Auckland Library
> Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
> Ph: +64 (0)9 373 7599 x81928
> 
> 
> 
> 
> -- 
> Timo Aalto
> Planning Officer
> University of Helsinki Library
> timo dot j dot aalto at helsinki dot fi
> ------------------------------------------------------------------------------
> The modern datacenter depends on network connectivity to access resources
> and provide services. The best practices for maximizing a physical server's
> connectivity to a physical network are well understood - see how these
> rules translate into the virtual world? 
> http://p.sf.net/sfu/oracle-sfdevnlfb
> _______________________________________________
> DSpace-tech mailing list
> DSpace-tech@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
