Hi Stuart,
2011/2/4 Stuart Lewis <[email protected]>
> Hi Timo,
>
> > I did some metadata gardening using the CSV bulk editor and as I uploaded
> > my changes I noticed that the bulk metadata importer added an empty
> > dc.description.abstract (with a 'fi' language identifier that is the default
> > for our installation nowadays) into just about every record I had in that
> > CSV (see: https://helda.helsinki.fi/handle/10138/24620?show=full for
> > example). Mea culpa in that regard that I should have noticed this before
> > pressing the yes key, but I still think there should be some kind of
> > sanity check on the importer about not creating empty metadata value fields.
>
> There is a sanity check for this, so I'm not sure how it could have
> happened:
>
> -
> http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/app/bulkedit/DSpaceCSV.java#L481
>
> Is the field definitely empty - could it have contained something like a
> single space?
>
The field is definitely empty. I checked the CSV for any spaces between
commas and there are none. Also, the SQL query

SELECT text_value, text_lang FROM metadatavalue WHERE metadata_field_id=27
AND text_value='' AND text_lang='fi';

returns 392 rows, about the number of affected Items. The same query with
one or more spaces in text_value returns zero rows.

Just a thought: the CSV contained Items originating from different DSpace
instances that have had different default metadata value languages set at
different times (i.e. there are abstracts written in Finnish but marked with
the 'en' language flag). The default metadata value language is now set to
Finnish. Could DSpace be making some ill-advised assumptions
based on the default metadata language?

Meanwhile I will try to re-export, edit and re-import the affected Items to
see if that gets rid of the empties...
>
> It looks like Item.java, when ingesting metadata, performs a trim() on the
> values, to remove any leading or trailing spaces:
>
> -
> http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/content/Item.java#L684
>
> The CSV importer does not perform a trim(). I'll need to test this to see
> if it could be a problem, and if it does cause the problem you're seeing,
> then we'll need to make sure the CSV importer performs a trim too, so that
> it doesn't pass on a single space (or multiple spaces) to Item.java, which
> then trims them and archives them.
>
> Thanks,
>
>
> Stuart Lewis
> Digital Development Manager
> Te Tumu Herenga The University of Auckland Library
> Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
> Ph: +64 (0)9 373 7599 x81928
>
>
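
To make the trim() scenario described above concrete, here is a minimal Java
sketch. It is not DSpace code: the class and method names are hypothetical and
only illustrate how a whitespace-only CSV cell could pass an "is it empty?"
check and then be stored as an empty string once a later step trims it.
Whether that is what happened with these records is unconfirmed, since the
CSV here appears to contain no such spaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in DSpace.
class CsvValueSanity {

    // Naive check: rejects only null and the exact empty string,
    // so a cell holding a single space is accepted.
    static boolean passesNaiveCheck(String value) {
        return value != null && !value.isEmpty();
    }

    // Stricter check: trims before testing, so whitespace-only
    // cells are rejected as well.
    static boolean passesTrimmedCheck(String value) {
        return value != null && !value.trim().isEmpty();
    }

    // Keeps only values that survive the stricter check and stores them
    // trimmed, so a later trim() in the ingest step cannot turn them
    // into empty strings.
    static List<String> cleanValues(List<String> rawValues) {
        List<String> cleaned = new ArrayList<>();
        for (String raw : rawValues) {
            if (passesTrimmedCheck(raw)) {
                cleaned.add(raw.trim());
            }
        }
        return cleaned;
    }
}
```

With this sketch, passesNaiveCheck(" ") is true while passesTrimmedCheck(" ")
is false, which is the gap between a check that does not trim and an ingest
step that does.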
--
Timo Aalto
Planning Officer
University of Helsinki Library
timo dot j dot aalto at helsinki dot fi
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech