We removed some empty metadata fields today by deleting them directly in the
database. It's not a recommended approach, but it's pretty easy, and if you
look at the logs, DSpace itself does the same thing.
Assuming metadata_field_id is 12345:

    delete from metadatavalue where metadata_field_id='12345';
    delete from metadatafieldregistry where metadata_field_id='12345';

I'm not near the DB at the moment, but I believe those are the correct table
and column names. You need to get rid of the records from both tables.
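If you'd rather try the two deletes on a throwaway copy first, here is a minimal sketch. It assumes Python's built-in sqlite3 as a stand-in for the real DSpace database, and the cut-down schema in make_demo_db() is hypothetical: only the table names and metadata_field_id come from the statements above. Note that the first delete removes every value stored for the field, not just the empty ones.

```python
import sqlite3


def delete_metadata_field(conn, field_id):
    """Delete a field's values, then its registry row, in one transaction.

    Returns (values_deleted, registry_rows_deleted).
    """
    # "with conn" commits if both statements succeed and rolls back if either raises.
    with conn:
        values = conn.execute(
            "DELETE FROM metadatavalue WHERE metadata_field_id = ?", (field_id,)
        ).rowcount
        fields = conn.execute(
            "DELETE FROM metadatafieldregistry WHERE metadata_field_id = ?", (field_id,)
        ).rowcount
    return values, fields


def make_demo_db():
    """Build an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the reference below
    conn.executescript(
        """
        CREATE TABLE metadatafieldregistry (
            metadata_field_id INTEGER PRIMARY KEY,
            element TEXT
        );
        CREATE TABLE metadatavalue (
            metadata_value_id INTEGER PRIMARY KEY,
            metadata_field_id INTEGER
                REFERENCES metadatafieldregistry(metadata_field_id),
            text_value TEXT,
            text_lang TEXT
        );
        INSERT INTO metadatafieldregistry VALUES (12345, 'unused'), (27, 'description');
        INSERT INTO metadatavalue VALUES
            (1, 12345, '', 'fi'),
            (2, 12345, '', 'fi'),
            (3, 27, 'An abstract', 'en');
        """
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    print(delete_metadata_field(demo, 12345))
```

Doing both deletes inside one transaction means a failure on the second leaves nothing half-removed. In this sketch the values go first because the demo schema makes metadatavalue reference the registry; I'd assume the real schema is similar, but check before relying on it.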
The UI should offer a way to do this, since at the moment all it gives is an
error saying that the field is in use. A further prompt could say 'Yep, I know,
delete it anyway ;)'
cheers,
Steve
On 07/02/2011, at 7:07 PM, Timo Aalto wrote:
> Hi Stuart,
>
> 2011/2/4 Stuart Lewis <s.le...@auckland.ac.nz>
> Hi Timo,
>
> > I did some metadata gardening using the CSV bulk editor and as I uploaded
> > my changes I noticed that the bulk metadata importer added an empty
> > dc.description.abstract (with a 'fi' language identifier that is the
> > default for our installation nowadays) into just about every record I had
> > in that csv. (see: https://helda.helsinki.fi/handle/10138/24620?show=full
> > for example). Mea culpa: I should have noticed this before pressing the
> > yes key, but I still think the importer should have some kind of sanity
> > check that stops it from creating empty metadata values.
>
> There is a sanity check for this, so I'm not sure how it could have happened:
>
> - http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/app/bulkedit/DSpaceCSV.java#L481
>
> Is the field definitely empty - could it have contained something like a
> single space?
>
> The field is definitely empty. I checked the csv for any spaces between
> commas and there are none. Also, the SQL query
>
> SELECT text_value, text_lang from metadatavalue where metadata_field_id=27
> and text_value='' and text_lang='fi';
>
> returns 392 rows, roughly the number of affected Items. The same query with
> one or more spaces in text_value returns zero rows.
>
> Just a thought - the CSV contained Items originating from different DSpace
> instances that have had different default metadata value languages set at
> different times (i.e. there are abstracts written in Finnish but marked with
> the 'en' language flag) - now the default metadata value language is set to
> Finnish. Could DSpace be making some ill-advised assumptions based on the
> default metadata language?
>
> Meanwhile I will try to re-export, edit and re-import the affected Items if
> that would get rid of the empties...
>
>
> It looks like Item.java, when ingesting metadata, performs a trim() on the
> values, to remove any leading or trailing spaces:
>
> - http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/content/Item.java#L684
>
> The CSV importer does not perform a trim(). I'll need to test this to see if
> it could be a problem, and if it does cause the problem you're seeing, then
> we'll need to make sure the CSV importer performs a trim too, so that it
> doesn't pass on a single space (or multiple spaces) to Item.java, which then
> trims them and archives them.
>
> Thanks,
>
>
> Stuart Lewis
> Digital Development Manager
> Te Tumu Herenga The University of Auckland Library
> Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
> Ph: +64 (0)9 373 7599 x81928
>
> --
> Timo Aalto
> Planning Officer
> University of Helsinki Library
> timo dot j dot aalto at helsinki dot fi
_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech