Hi all,

I've written a number of scripts to get ETD-DB data and files into a 
format that the DSpace importer can accept. They have been relatively 
successful, except that the importer sometimes crashes with this error:

Adding item from directory etd-05122005-082838
[Fatal Error] dublin_core.xml:9:111: Invalid byte 1 of 1-byte UTF-8 sequence.
org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:208)
        at org.dspace.app.itemimport.ItemImport.loadXML(ItemImport.java:1269)
        at org.dspace.app.itemimport.ItemImport.loadDublinCore(ItemImport.java:795)
        at org.dspace.app.itemimport.ItemImport.loadMetadata(ItemImport.java:780)
        at org.dspace.app.itemimport.ItemImport.addItem(ItemImport.java:626)
        at org.dspace.app.itemimport.ItemImport.addItems(ItemImport.java:498)
        at org.dspace.app.itemimport.ItemImport.main(ItemImport.java:407)
org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.


What seems to work is to copy all the DC data into a new XML file and 
run the importer again. I am having to do this for just about every other 
dublin_core.xml file, which could take quite a bit of time with over 500 
items to import :)

Does anyone know why I'm getting this error? Could it be caused by 
transferring these XML files from one server to the other over FTP?
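
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the <93>, <94> and <92> markers in the abstract further down look like single Windows-1252 bytes (curly quotes and an apostrophe). A file with no encoding declaration is read as UTF-8, where those bytes are not valid on their own, which would match the parser's complaint. The Python sketch below locates the first offending byte and re-encodes a file to UTF-8. The function names and the choice of cp1252 as the source encoding are illustrative assumptions, not part of DSpace or ETD-DB.

```python
import sys
from pathlib import Path


def find_invalid_utf8(data: bytes):
    """Return (line, column, byte) of the first byte that is not valid
    UTF-8, using 1-based line and column numbers, or None if data is clean."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        line = data.count(b"\n", 0, err.start) + 1
        line_start = data.rfind(b"\n", 0, err.start) + 1
        return line, err.start - line_start + 1, data[err.start]
    return None


def to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode data as UTF-8, assuming it was written in source_encoding.

    Data that is already valid UTF-8 is returned unchanged. Bytes that
    source_encoding does not define will raise UnicodeDecodeError.
    """
    if find_invalid_utf8(data) is None:
        return data
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Usage (hypothetical paths): python fix_encoding.py */dublin_core.xml
    for name in sys.argv[1:]:
        path = Path(name)
        raw = path.read_bytes()
        problem = find_invalid_utf8(raw)
        if problem is not None:
            line, column, byte = problem
            print(f"{name}: byte 0x{byte:02x} at line {line}, column {column}")
            path.write_bytes(to_utf8(raw))
```

If this is the cause, it would also explain why pasting the data into a new file helps, since an editor may save the copy as UTF-8. Declaring the encoding in an XML prolog instead may work too, assuming the parser supports that encoding.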

Here's the Dublin Core I've put together from the ETD-DB database using 
Perl. I have CDATA sections, which seem to be okay with the parser. Someone 
(who shall remain nameless :) thought it would be a fabulous idea to put 
HTML tags into the abstract field in the ETD-DB MySQL database! I guess 
they were thinking we wouldn't ever migrate to another system. Alas.

Thanks for your help,
Susan



<dublin_core>
<dcvalue element='contributor' qualifier='author'>Simsek, Yilmaz</dcvalue>
<dcvalue element='description' qualifier='degree'>PHD</dcvalue>
<dcvalue element='description' qualifier='department'><![CDATA[Political Science & Public Administration]]></dcvalue>
<dcvalue element='type' qualifier='none'>dissertation</dcvalue>
<dcvalue element='title' qualifier='none'>Impact of Terrorism on Migration Patterns in Turkey</dcvalue>
<dcvalue element='description' qualifier='abstract'><![CDATA[<p>This 
study is among the first studies that evaluate the social impacts of 
terrorism in a specific country for a 10 year period. It tests the 
effects of terrorism on domestic net-migration in Turkey, especially in 
the terror infected provinces of the Eastern and South Eastern regions 
of the country between the years 1992 and 2001. Terrorism has impacted 
people not only physically, but also psychologically. When faced with 
<93>future uncertainty<94> or the <93>fear of terrorism,<94> it is 
natural for people to leave their home towns, and to migrate to 
somewhere else where they feel safe. In order to explore the real impact 
of terrorism on immigration, this study used <93>terrorism incident 
rate<94> per 10,000 people and the <93>rate of people and security 
forces killed<94> per 10,000 people as independent variables. It also 
examined the major economic effects of migration; unemployment rate and 
the GDP were used as control variables. In addition, the rate of killed 
terrorists, population density, and the distance to Istanbul and to 
Mersin were also added to the models.</p> ^M
<p>A control-series regression analysis was performed to relate the 
terrorist incidents<92> impact on the citizens<92> inclinations to leave 
their home towns in all provinces and in high terrorism incident 
provinces of East and Southeast regions of Turkey. Results show that the 
net-migration in high terrorism incident provinces is higher than the 
net-migration in other provinces. Findings also confirm that there was a 
positive relationship between net-migration and terrorist incidents and 
that relationship was higher during 1992-1995, when the number of 
terrorist incidents hit its all time highest level. Other than terrorist 
incidents, results moreover confirm that net-migration is positively 
related to the number of "people and security forces killed".</p>^M
<p>In addition, results also confirm that population density and 
distance were related to net-migration. Economic variables, such as GDP 
and unemployment also related to net migration. However, their impacts 
varied from model to model. While the GDP was negatively related to 
net-migration in the models with all the provinces; unemployment was 
positively related to net-migration in the models with only high 
terrorism incident provinces.</p>]]></dcvalue>
<dcvalue element='date' qualifier='submitted'>2006</dcvalue>
<dcvalue element='identifier' 
qualifier='other'>http://etd.vcu.edu/theses/available/etd-08032006-131817/</dcvalue>
<dcvalue element='subject' qualifier='none'>PKK</dcvalue>
<dcvalue element='subject' qualifier='none'>Turkey</dcvalue>
<dcvalue element='subject' qualifier='none'>terrorism</dcvalue>
<dcvalue element='subject' qualifier='none'>conflict</dcvalue>
<dcvalue element='subject' qualifier='none'>migration</dcvalue>
<dcvalue element='subject' qualifier='none'>internally displaced persons</dcvalue>
</dublin_core>
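
A side note on the CDATA sections, since the script wraps values in them: as far as I understand the XML rules, entity references are not expanded inside CDATA, so an escaped ampersand written there is kept as literal text instead of becoming a plain ampersand. The short Python sketch below, using the standard library's minidom, illustrates the difference. The helper name dcvalue_text is made up for this example.

```python
from xml.dom.minidom import parseString


def dcvalue_text(xml_text: str) -> str:
    """Return the character data of the first <dcvalue> element,
    joining plain text nodes and CDATA sections."""
    doc = parseString(xml_text)
    node = doc.getElementsByTagName("dcvalue")[0]
    return "".join(child.data for child in node.childNodes)


# Escaped ampersand inside CDATA: the entity text is kept literally.
in_cdata = dcvalue_text("<dcvalue><![CDATA[Science &amp; Public]]></dcvalue>")

# Escaped ampersand in ordinary text: the parser expands it.
in_text = dcvalue_text("<dcvalue>Science &amp; Public</dcvalue>")

print(repr(in_cdata))
print(repr(in_text))
```

So a value should either be escaped or wrapped in CDATA, not both.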

-- 
Susan Teague Rector
Web Applications Manager
Library Information Systems, VCU Libraries
804.827.3554 | [EMAIL PROTECTED]



_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
