Hi,
I did the re-export, review CSV, re-import drill, and the importer stated
that it found no changes. The only thing left to try at the moment without
hacking the database directly (thanks for the tip, Steve!) would be to
1) export to CSV, 2) populate the unwanted fields with some mock data,
3) re-import, 4) export again, 5) remove the mock data, and 6) re-import
and hope that DSpace finally decides to remove those fields.
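
For the CSV edits in steps 2 and 5, a minimal Python sketch could look like
this. The column header dc.description.abstract[fi], the placeholder string
and the sample rows are assumptions for illustration only, and whether DSpace
actually drops the fields on the final import is untested.

```python
import csv
import io

# Hypothetical mock value; any string that cannot occur in real data will do.
PLACEHOLDER = "TO-BE-REMOVED"
# Assumed header of the affected column in the exported CSV.
COLUMN = "dc.description.abstract[fi]"


def rewrite_column(csv_text, column, old, new):
    """Return csv_text with every cell in `column` equal to `old` set to `new`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[column] == old:
            row[column] = new
        writer.writerow(row)
    return out.getvalue()


# Made-up stand-in for an exported file: item 1 has the unwanted empty field.
exported = (
    "id,dc.title,dc.description.abstract[fi]\n"
    "1,First,\n"
    "2,Second,Real abstract\n"
)

# Step 2: fill the empty cells with mock data before the first re-import.
with_mock = rewrite_column(exported, COLUMN, "", PLACEHOLDER)
# Step 5: blank the mock data again before the second re-import.
cleaned = rewrite_column(with_mock, COLUMN, PLACEHOLDER, "")
```

In practice the two calls would run on two different exports (steps 1 and 4),
not back to back on the same text as in this sketch.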
T
2011/2/7 Stuart Lewis <[email protected]>
> Hi Timo,
>
> Thanks for the update - it will be interesting to hear how your re-import
> goes. Since there are no spaces in the CSV (these would, I think, although
> I've not had time to confirm it, be translated into empty strings once they
> get into the database), I'm not sure what could cause this. Language field
> shouldn't be an issue, as one of the things the CSV has been used most for
> by people is to fix up incorrect language codes.
>
> The only bug we've encountered (and fixed) recently, which looks like it
> has been in DSpace for many many previous versions, is a problem with
> schemas. If you had dc.description.abstract and xyz.description.abstract,
> then asking DSpace for the values of one would get the values of both. But
> from your description I don't think this is the case for you.
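
The schema problem described above can be modelled in a few lines of Python.
This is only an illustrative sketch, not DSpace code: the Value type, both
lookup functions and the sample values are hypothetical.

```python
from typing import NamedTuple


class Value(NamedTuple):
    schema: str
    element: str
    qualifier: str
    text: str


# Made-up sample data: the same element and qualifier in two schemas.
VALUES = [
    Value("dc", "description", "abstract", "A Dublin Core abstract"),
    Value("xyz", "description", "abstract", "A local-schema abstract"),
]


def lookup_ignoring_schema(values, schema, element, qualifier):
    # Models the bug: the schema argument is accepted but never compared.
    return [v.text for v in values
            if v.element == element and v.qualifier == qualifier]


def lookup_with_schema(values, schema, element, qualifier):
    # Models the fix: schema, element and qualifier must all match.
    return [v.text for v in values
            if (v.schema, v.element, v.qualifier) == (schema, element, qualifier)]


buggy = lookup_ignoring_schema(VALUES, "dc", "description", "abstract")
fixed = lookup_with_schema(VALUES, "dc", "description", "abstract")
```

With the sample data, the first lookup returns the values of both schemas,
while the second returns only the dc one.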
>
> If you do find anything that causes it, please get in touch and we'll try
> and work out what is going on.
>
> Thanks,
>
>
> Stuart Lewis
> Digital Development Manager
> Te Tumu Herenga The University of Auckland Library
> Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
> Ph: +64 (0)9 373 7599 x81928
>
>
>
> On 7/02/2011, at 9:07 PM, Timo Aalto wrote:
>
> > Hi Stuart,
> >
> > 2011/2/4 Stuart Lewis <[email protected]>
> > Hi Timo,
> >
> > > I did some metadata gardening using the CSV bulk editor and as I
> uploaded my changes I noticed that the bulk metadata importer added an empty
> dc.description.abstract (with a 'fi' language identifier that is the default
> for our installation nowadays) into just about every record I had in that
> csv. (see: https://helda.helsinki.fi/handle/10138/24620?show=full for
example). Mea culpa, in that I should have noticed this before
pressing the yes key, but I still think there should be some kind of
sanity check in the importer against creating empty metadata value fields.
> >
> > There is a sanity check for this, so I'm not sure how it could have
> happened:
> >
> > -
> http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/app/bulkedit/DSpaceCSV.java#L481
> >
> > Is the field definitely empty - could it have contained something like a
> single space?
> > The field is definitely empty. I checked the csv for any spaces between
commas and there are none. Also, the SQL query
> >
> > SELECT text_value, text_lang from metadatavalue where
> metadata_field_id=27 and text_value='' and text_lang='fi';
> >
> > returns 392 rows, about the amount of affected Items. The same query with
> one or more spaces in text_value returns zero.
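
The two checks above (exactly empty versus spaces only) can be tried out
against a throwaway database. The sketch below uses an in-memory SQLite table
as a stand-in for the real DSpace database, which is an assumption; the table
and column names and field id 27 come from the query above, and the rows are
made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadatavalue ("
    "metadata_field_id INTEGER, text_value TEXT, text_lang TEXT)"
)
# Made-up sample rows: two empty 'fi' values, one single-space value,
# one real abstract and one empty value in another language.
conn.executemany(
    "INSERT INTO metadatavalue VALUES (?, ?, ?)",
    [
        (27, "", "fi"),
        (27, "", "fi"),
        (27, " ", "fi"),
        (27, "An abstract", "fi"),
        (27, "", "en"),
    ],
)


def count(condition):
    """Count 'fi' rows for field 27 that also satisfy `condition`."""
    sql = (
        "SELECT COUNT(*) FROM metadatavalue "
        "WHERE metadata_field_id = 27 AND text_lang = 'fi' AND " + condition
    )
    return conn.execute(sql).fetchone()[0]


# Values that are exactly the empty string.
empty = count("text_value = ''")
# Values that are non-empty but consist only of spaces.
blank = count("text_value <> '' AND trim(text_value) = ''")
```

Using trim() in the second condition catches any number of spaces in a single
query, instead of testing one space, two spaces and so on separately.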
> >
> > Just a thought - the CSV contained Items originating from different DSpace
instances that have had different default metadata value languages set at
different times (i.e. there are abstracts written in Finnish but marked with
an 'en' language flag) - now the default metadata value language is set to
Finnish. Could it be possible that DSpace makes some ill-advised assumptions
based on the default metadata language?
> >
> > Meanwhile I will try to re-export, edit and re-import the affected Items
> if that would get rid of the empties...
> >
> >
> > It looks like Item.java, when ingesting metadata, performs a trim() on
> the values, to remove any leading or trailing spaces:
> >
> > -
> http://scm.dspace.org/trac/dspace/browser/dspace/trunk/dspace-api/src/main/java/org/dspace/content/Item.java#L684
> >
> > The CSV importer does not perform a trim(). I'll need to test this to
> see if it could be a problem, and if it does cause the problem you're
> seeing, then we'll need to make sure the CSV importer performs a trim too,
> so that it doesn't pass on a single space (or multiple spaces) to Item.java,
> which then trims them and archives them.
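
The suspected interaction can be sketched in Python as follows. None of this
is DSpace code: all function names are hypothetical, and str.strip() stands
in for Java's trim(), which it resembles closely but not exactly.

```python
def importer_accepts(value):
    # Stand-in for a sanity check that rejects only exactly-empty strings.
    return value != ""


def ingest(value):
    # Stand-in for the ingest step, which trims leading and trailing spaces.
    return value.strip()


def import_without_trim(values):
    # Check the raw value, then trim on ingest: " " passes the check
    # and is stored as "".
    return [ingest(v) for v in values if importer_accepts(v)]


def import_with_trim(values):
    # Trim before the check, so whitespace-only values are rejected too.
    return [ingest(v) for v in values if importer_accepts(v.strip())]


# Made-up cell contents: a real value, a single space and an empty cell.
cells = ["An abstract", " ", ""]

stored_without_trim = import_without_trim(cells)
stored_with_trim = import_with_trim(cells)
```

Without the extra trim, the single-space cell ends up stored as an empty
string. With it, only the real abstract is stored.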
> >
> > Thanks,
> >
> >
> > Stuart Lewis
> > Digital Development Manager
> > Te Tumu Herenga The University of Auckland Library
> > Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
> > Ph: +64 (0)9 373 7599 x81928
> >
> >
> >
> >
> > --
> > --
> > Timo Aalto
> > Planning Officer
> > University of Helsinki Library
> > timo dot j dot aalto at helsinki dot fi
>
>
>
--
--
Timo Aalto
Planning Officer
University of Helsinki Library
timo dot j dot aalto at helsinki dot fi
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech