RE: [dspace-tech] CDATA use for imports

2020-05-18 Thread Poulter, Dale
Mark,

Just to close the loop.  Thanks for your help.  I ended up creating a list of 
entities and their replacements, which appears to have worked -- at least in 
testing.
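
For readers following along, here is a minimal Python sketch of that kind of
entity replacement. The actual list used is not shown in this thread, so the
table below is a hypothetical sample and the function name is made up for
illustration. The idea is to turn HTML named entities, which a plain XML
parser does not know, into literal characters, while leaving XML's own
predefined entities (such as &amp;) untouched.

```python
# Hypothetical sample table; a real one would be built from the entities
# actually found in the source data.
REPLACEMENTS = {
    "&nbsp;": " ",
    "&eacute;": "\u00e9",  # e with acute accent
    "&ouml;": "\u00f6",    # o with diaeresis
    "&copy;": "\u00a9",    # copyright sign
}


def replace_entities(text: str) -> str:
    """Replace each listed HTML entity with its literal character."""
    for entity, replacement in REPLACEMENTS.items():
        text = text.replace(entity, replacement)
    return text


if __name__ == "__main__":
    print(replace_entities("Caf&eacute;&nbsp;R&ouml;ntgen &amp; Co"))
```

Entities that are missing from the table pass through unchanged, so a test
ingestion would still be needed to find the ones that remain.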


-Dale

-----Original Message-----
From: Poulter, Dale 
Sent: Monday, May 18, 2020 11:47 AM
To: Mark H. Wood ; DSpace Technical Support 

Subject: RE: [dspace-tech] CDATA use for imports

Mark,

Thanks for the reply.  The information is being pulled from a MySQL database.  
These are old ETD entries that were entered into the system by students.  We 
are pulling the specific fields to create the dublin_core.xml ingest file.
 


-Dale

-----Original Message-----
From: Mark H. Wood,UL 0115A,+1 317 274 0749,  On Behalf Of 
Mark H. Wood
Sent: Monday, May 18, 2020 10:18 AM
To: DSpace Technical Support 
Subject: Re: [dspace-tech] CDATA use for imports

On Mon, May 18, 2020 at 01:11:17PM +, Poulter, Dale wrote:
> We are migrating several items from an older system to DSpace using the 
> simple item import.  As is often the case with older systems,  the data is 
> not as clean as we would like.  As a result, several items fail due to bad 
> HTML (open tags with no closing tags, and a few diacritic issues).  One way to 
> allow the data to migrate is to wrap the text in a CDATA section.  
> However, it appears the import ignores anything in the CDATA section.  Is 
> this expected behavior?

I assume that it was a typo, but a CDATA section opens with "<![CDATA[".
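
As a rough illustration of the CDATA syntax, the Python sketch below wraps a
metadata value in a CDATA section and parses it back with the standard
library. It only shows how a generic XML parser treats CDATA; it says nothing
about how the DSpace importer itself behaves. The helper names and the sample
value are made up for the example.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text: str) -> str:
    """Wrap text in a CDATA section.

    A literal "]]>" inside the text would end the section early, so it is
    split across two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def dcvalue(text: str) -> str:
    """Build one dcvalue element whose content is a CDATA section."""
    return (
        '<dcvalue element="description" qualifier="abstract">'
        + wrap_cdata(text)
        + "</dcvalue>"
    )


raw = "An <i>unclosed tag & a stray ampersand"
parsed = ET.fromstring(dcvalue(raw))

if __name__ == "__main__":
    # The parser hands back the CDATA content as ordinary character data.
    print(parsed.text)
```

Note that the markup inside the CDATA section arrives as literal text, not as
elements, so anything downstream that expects real tags will see none.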

Re: [dspace-tech] CDATA use for imports

2020-05-18 Thread Mark H. Wood
On Mon, May 18, 2020 at 04:46:50PM +, Poulter, Dale wrote:
> Thanks for the reply.  The information is being pulled from a MySQL database. 
>  These are old ETD entries that were entered into the system by students.  We 
> are pulling the specific fields to create the dublin_core.xml ingest file.

OK, so the problem is with metadata values.  I haven't yet found a
list of which fields can be marked up, but here's a link to which
elements can be used.  Note that this only applies to XMLUI -- I don't
know what JSPUI will do with marked-up metadata.

  https://wiki.lyrasis.org/display/DSDOC6x/Simple+HTML+Fragment+Markup

I haven't found my lists of things to be fixed up when building
batches.  Each source of batch input seems to come with its own set of
problems anyway.  I usually have to build batches, do a test (-t)
ingestion, see what is rejected, add a rule, and repeat until the test
runs without error.
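
The build, test, add-a-rule cycle can be approximated offline before a batch
ever reaches DSpace. The Python sketch below is only an assumption about how
one might do that: it applies a list of hypothetical cleanup rules to each
metadata value and uses the standard library XML parser as a stand-in
well-formedness check. It does not call the DSpace importer, and passing this
check does not guarantee that a real test ingestion will succeed.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules; each rejected value would typically prompt a new one.
RULES = [
    # Turn unclosed line breaks into self-closing elements.
    (re.compile(r"<br\s*>", re.IGNORECASE), "<br/>"),
    # Escape ampersands that do not start one of XML's predefined entities
    # or a numeric character reference.
    (re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9a-fA-F]+);)"), "&amp;"),
]


def apply_rules(value: str) -> str:
    """Run every cleanup rule over one metadata value, in order."""
    for pattern, replacement in RULES:
        value = pattern.sub(replacement, value)
    return value


def is_well_formed(value: str) -> bool:
    """Check whether the value parses as XML element content."""
    try:
        ET.fromstring("<dcvalue>" + value + "</dcvalue>")
        return True
    except ET.ParseError:
        return False


def find_rejects(values):
    """Return the original values that still fail after cleanup."""
    return [v for v in values if not is_well_formed(apply_rules(v))]


if __name__ == "__main__":
    sample = ["R&D report", "line one<br>line two", "open <i>tag"]
    for rejected in find_rejects(sample):
        print("still rejected:", rejected)
```

Whatever is still rejected shows which rule to write next, and the loop is
repeated until nothing is left.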

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu

