Hi Timo,

Thanks for the update - it will be interesting to hear how your re-import goes. Since there are no spaces in the CSV, I'm not sure what could cause this. (I think spaces would end up as empty strings once they get into the database, although I've not had time to confirm it.) The language field shouldn't be an issue, as one of the things people have used the CSV for most is fixing up incorrect language codes.
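The space-to-empty-string path under suspicion can be sketched as follows. This is only an illustration of the mechanism, not DSpace code: the class and method names are hypothetical, and the importer-side check is assumed to reject only values that are exactly empty. The CSV in this thread reportedly contains no such spaces, so the sketch does not explain the 392 empty rows by itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are DSpace classes or methods.
final class TrimSketch {

    // Assumed importer-side sanity check: rejects null and exactly-empty
    // values, but does not trim before checking.
    static boolean importerAccepts(String csvValue) {
        return csvValue != null && !csvValue.isEmpty();
    }

    // Assumed ingest step: trims leading and trailing whitespace before
    // storing, as Item.java is described as doing.
    static String ingest(String value) {
        return value.trim();
    }

    // Untrimmed path: a single space passes the check and is stored as "".
    static List<String> importUntrimmed(List<String> csvValues) {
        List<String> stored = new ArrayList<>();
        for (String v : csvValues) {
            if (importerAccepts(v)) {
                stored.add(ingest(v));
            }
        }
        return stored;
    }

    // Possible fix: trim in the importer first, then apply the same check.
    static List<String> importTrimmed(List<String> csvValues) {
        List<String> stored = new ArrayList<>();
        for (String v : csvValues) {
            if (v == null) {
                continue;
            }
            String trimmed = v.trim();
            if (importerAccepts(trimmed)) {
                stored.add(ingest(trimmed));
            }
        }
        return stored;
    }
}
```

With the input values "An abstract", " " and "", the untrimmed path stores two values (the abstract and an empty string), while the trimmed path stores only the abstract.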
The only bug we've encountered (and fixed) recently, which looks like it has been in DSpace for many previous versions, is a problem with schemas. If you had dc.description.abstract and xyz.description.abstract, then asking DSpace for the values of one would return the values of both. But from your description I don't think this is the case for you.

If you do find anything that causes it, please get in touch and we'll try to work out what is going on.

Thanks,

Stuart Lewis
Digital Development Manager
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: +64 (0)9 373 7599 x81928

On 7/02/2011, at 9:07 PM, Timo Aalto wrote:

> Hi Stuart,
>
> 2011/2/4 Stuart Lewis <[email protected]>
> Hi Timo,
>
> > I did some metadata gardening using the CSV bulk editor and as I uploaded
> > my changes I noticed that the bulk metadata importer added an empty
> > dc.description.abstract (with a 'fi' language identifier that is the
> > default for our installation nowadays) into just about every record I had
> > in that csv. (see: https://helda.helsinki.fi/handle/10138/24620?show=full
> > for example). Mea culpa in that regard that I should have noticed this
> > before pressing the yes key, but I still think there should be some kind
> > of a sanity check on the importer about not creating empty metadata value
> > fields.
>
> There is a sanity check for this, so I'm not sure how it could have happened:
>
> - http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/app/bulkedit/DSpaceCSV.java#L481
>
> Is the field definitely empty - could it have contained something like a
> single space?
>
> The field is definitely empty. I checked the csv for any spaces between
> commas and there is none, also a SQL query
>
> SELECT text_value, text_lang from metadatavalue where metadata_field_id=27
> and text_value='' and text_lang='fi';
>
> returns 392 rows, about the amount of affected Items.
> The same query with one or more spaces in text_value returns zero.
>
> Just a thought - the CSV contained Items originating from different DSpace
> instances that have had different default metadata value languages set at
> different times (i.e. there are abstracts written in Finnish but marked with
> 'en' language flag) - now the default metadata value language is set to
> Finnish. Could it be possible that DSpace makes some ill-advised assumptions
> based on the default metadata language?
>
> Meanwhile I will try to re-export, edit and re-import the affected Items if
> that would get rid of the empties...
>
>
> It looks like Item.java, when ingesting metadata, performs a trim() on the
> values, to remove any leading or trailing spaces:
>
> - http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/content/Item.java#L684
>
> The CSV importer does not perform a trim(). I'll need to test this to see if
> it could be a problem, and if it does cause the problem you're seeing, then
> we'll need to make sure the CSV importer performs a trim too, so that it
> doesn't pass on a single space (or multiple spaces) to Item.java, which then
> trims them and archives them.
>
> Thanks,
>
>
> Stuart Lewis
> Digital Development Manager
> Te Tumu Herenga The University of Auckland Library
> Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
> Ph: +64 (0)9 373 7599 x81928
>
> --
> Timo Aalto
> Planning Officer
> University of Helsinki Library
> timo dot j dot aalto at helsinki dot fi
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

