RE: [dspace-tech] CDATA use for imports
Mark,

Just to close the loop. Thanks for your help. I ended up just creating a list of entities and their replacements which appears to have worked -- at least in testing.

-Dale
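A minimal sketch of what an "entities and their replacements" list might look like. The thread does not show Dale's actual list, so the table below, the names `ENTITY_REPLACEMENTS` and `replace_entities`, and the choice of numeric character references as replacements are all assumptions made for illustration. The idea is that HTML named entities such as `&eacute;` are undefined in plain XML, while numeric references are accepted by any XML parser.

```python
import re

# Hypothetical replacement table (not Dale's actual list): HTML named
# entities mapped to numeric character references.
ENTITY_REPLACEMENTS = {
    "&nbsp;": "&#160;",
    "&eacute;": "&#233;",
    "&ouml;": "&#246;",
    "&ndash;": "&#8211;",
    "&rsquo;": "&#8217;",
}

# The five entities that XML itself predefines are left untouched.
XML_PREDEFINED = {"&amp;", "&lt;", "&gt;", "&quot;", "&apos;"}

ENTITY_PATTERN = re.compile(r"&[A-Za-z][A-Za-z0-9]*;")


def replace_entities(text):
    """Return (cleaned_text, unknown), where unknown lists entities with no rule."""
    unknown = []

    def substitute(match):
        entity = match.group(0)
        if entity in XML_PREDEFINED:
            return entity
        if entity in ENTITY_REPLACEMENTS:
            return ENTITY_REPLACEMENTS[entity]
        # No rule yet: keep the entity and report it so a rule can be added.
        unknown.append(entity)
        return entity

    return ENTITY_PATTERN.sub(substitute, text), unknown
```

Returning the unmatched entities alongside the cleaned text makes it easy to grow the table one rejected value at a time.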
Re: [dspace-tech] CDATA use for imports
On Mon, May 18, 2020 at 04:46:50PM +, Poulter, Dale wrote:
> Thanks for the reply. The information is being pulled from a MySQL database.
> These are old ETD entries that were entered into the system by students. We
> are pulling the specific fields to create the Dublin_core.xml ingest file.

OK, so the problem is with metadata values. I haven't yet found a list of which fields can be marked up, but here's a link to which elements can be used. Note that this only applies to XMLUI -- I don't know what JSPUI will do with marked-up metadata.

https://wiki.lyrasis.org/display/DSDOC6x/Simple+HTML+Fragment+Markup

I haven't found my lists of things to be fixed up when building batches. Each source of batch input seems to come with its own set of problems anyway. I usually have to build batches, do a test (-d) ingestion, see what is rejected, add a rule, and repeat until the test runs without error.

--
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu
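The build, test, add-a-rule cycle Mark describes can be partly rehearsed before any ingestion by checking each generated file for XML well-formedness locally. The sketch below is an assumption-laden illustration, not Mark's tooling: the two rules in `RULES` and the function names are hypothetical, and a local parse only catches malformed XML, not anything else the importer might reject.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical fix-up rules; in practice each one would be added after a
# rejection was observed in a test run.
RULES = [
    (re.compile(r"<br\s*>", re.IGNORECASE), "<br/>"),    # unclosed line break
    (re.compile(r"&(?!#?[A-Za-z0-9]+;)"), "&amp;"),      # bare ampersand
]


def apply_rules(xml_text):
    """Apply every fix-up rule in order and return the cleaned text."""
    for pattern, replacement in RULES:
        xml_text = pattern.sub(replacement, xml_text)
    return xml_text


def find_malformed(xml_documents):
    """Map document name to parser error for documents still broken after the rules."""
    problems = {}
    for name, xml_text in xml_documents.items():
        try:
            ET.fromstring(apply_rules(xml_text))
        except ET.ParseError as err:
            problems[name] = str(err)
    return problems
```

Whatever `find_malformed` still reports is a candidate for the next rule, which mirrors the repeat-until-clean loop in the message above.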
RE: [dspace-tech] CDATA use for imports
Mark,

Thanks for the reply. The information is being pulled from a MySQL database. These are old ETD entries that were entered into the system by students. We are pulling the specific fields to create the Dublin_core.xml ingest file.

-Dale

-----Original Message-----
From: Mark H. Wood, UL 0115A, +1 317 274 0749, On Behalf Of Mark H. Wood
Sent: Monday, May 18, 2020 10:18 AM
To: DSpace Technical Support
Subject: Re: [dspace-tech] CDATA use for imports

On Mon, May 18, 2020 at 01:11:17PM +, Poulter, Dale wrote:
> We are migrating several items from an older system to DSpace using the
> simple item import. As is often the case with older systems, the data is
> not as clean as we would like. As a result several items fail due to bad
> HTML (open tags with no closing tags, and a few diacritic issues). One way
> to allow the data to migrate is to wrap the text in a CDATA section.
> However, it appears the import ignores anything in the CDATA section. Is
> this expected behavior?

I assume that it was a typo, but a CDATA section opens with "
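One way to generate the ingest file from database fields is to let an XML library do the escaping, so that stray `<` and `&` characters in student-entered text cannot break well-formedness. This is only a sketch under stated assumptions: `build_dublin_core` and the sample tuples are hypothetical, the MySQL query is omitted, and the `dcvalue` layout follows the DSpace Simple Archive Format as I understand it.

```python
import xml.etree.ElementTree as ET


def build_dublin_core(fields):
    """Build dublin_core.xml content from (element, qualifier, value) tuples.

    The tuples stand in for values already fetched from the source database.
    """
    root = ET.Element("dublin_core")
    for element, qualifier, value in fields:
        dcvalue = ET.SubElement(
            root, "dcvalue", element=element, qualifier=qualifier or "none"
        )
        # Assigning .text lets the serializer escape &, < and > automatically.
        dcvalue.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values containing an unclosed tag and a bare ampersand.
sample = build_dublin_core([
    ("title", None, "Fish & <i>chips"),
    ("contributor", "author", "Müller, A."),
])
```

The trade-off is that any HTML in the source is escaped and will therefore display as literal text rather than as markup.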
Re: [dspace-tech] CDATA use for imports
On Mon, May 18, 2020 at 01:11:17PM +, Poulter, Dale wrote:
> We are migrating several items from an older system to DSpace using the
> simple item import. As is often the case with older systems, the data is
> not as clean as we would like. As a result several items fail due to bad
> HTML (open tags with no closing tags, and a few diacritic issues). One way
> to allow the data to migrate is to wrap the text in a CDATA section.
> However, it appears the import ignores anything in the CDATA section. Is
> this expected behavior?

I assume that it was a typo, but a CDATA section opens with
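For reference, in standard XML a CDATA section is delimited by `<![CDATA[` and `]]>`, and a conforming parser returns its contents as ordinary character data. The sketch below only demonstrates that parser behaviour with Python's standard library. It says nothing about how the DSpace importer treats CDATA, and `wrap_cdata` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text):
    """Wrap text in a CDATA section.

    A literal "]]>" inside the text would end the section early, so it is
    split across two adjacent sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Text that would not be well-formed XML if embedded as-is.
raw = "An <i>unclosed tag & a bare ampersand"
doc = (
    '<dcvalue element="description" qualifier="abstract">'
    + wrap_cdata(raw)
    + "</dcvalue>"
)

# The parser hands back the CDATA contents unchanged as the element's text.
parsed = ET.fromstring(doc)
```

If the opening delimiter is mistyped, the parser no longer sees a CDATA section at all, so it is worth checking the generated file for the exact delimiters first.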