Klaas-2 wrote:
Are you sending Content-Type headers with appropriate charset
indicated? Is your XML fully escaped in your update message?
...no, actually I simply make a
URLConnection conn = url.openConnection();
conn.setRequestProperty("ContentType", "text/xml");
conn.setDoOutput(true);
wr = new OutputStreamWriter(conn.getOutputStream());
wr.write(data);
wr.flush();
to post the delete/add XML, and my XML is embedded in a CDATA section
without further escaping... do I have to do something else?
I'm getting data from a MySQL db and I ran into some problems when
retrieving data from there.
I've made some progress by connecting to the db with
"characterEncoding=utf8" in the JDBC URL, and then converting with:
new String(mysqlXMLField.getBytes("latin1"));
If you use "characterEncoding=utf8", then I think you'll get back a
stream of UTF-8 bytes from the DB.
I don't know what mysqlXMLField's type is (from above), but you
should start with the array of bytes returned from the JDBC call, and
then create the string from this array using "UTF-8" as the encoding
name. Or just use those bytes directly when writing out the XML.
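As a minimal sketch of that suggestion, assuming the JDBC call hands back the raw bytes of the field (for example via ResultSet.getBytes), the string could be built like this. The class and method names are hypothetical and only for illustration:

```java
import java.nio.charset.StandardCharsets;

class Utf8Decode {
    // Decode raw bytes as UTF-8 explicitly, instead of relying on the
    // platform default charset (which is what new String(bytes) does).
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9 (e with acute accent).
        byte[] raw = {(byte) 0xC3, (byte) 0xA9};
        String s = decode(raw);
        System.out.println(s.length());
    }
}
```

Naming the charset on every conversion keeps the result independent of the JVM's default encoding, which is a common source of this kind of problem.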
But I'm really not into charsets and encodings...
The best thing to do is:
1. Make sure the XML you send to Solr starts with this line:
<?xml version="1.0" encoding="utf-8"?>
2. Make sure you've converted all of the text in the XML fields to
the UTF-8 character set.
Then don't wrap those fields with CDATA.
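The two steps above might look roughly like the following sketch. It is only an illustration under some assumptions: the class name, the helper names, and the field names "id" and "text" are hypothetical, and the update URL is whatever your Solr instance uses. Note the hyphenated "Content-Type" header name, the explicit charset, and the UTF-8 encoding passed to the writer:

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class SolrUtf8Post {
    // Escape the characters that are special in XML text content,
    // so the field values do not need a CDATA wrapper.
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build an add message that starts with the UTF-8 XML declaration.
    // The field names "id" and "text" are hypothetical examples.
    static String buildAddXml(String id, String text) {
        return "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
             + "<add><doc>"
             + "<field name=\"id\">" + escapeXml(id) + "</field>"
             + "<field name=\"text\">" + escapeXml(text) + "</field>"
             + "</doc></add>";
    }

    // Post the message, declaring the charset in the header and
    // encoding the request body as UTF-8 to match it.
    static void post(URL url, String xml) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        conn.setDoOutput(true);
        try (Writer wr = new OutputStreamWriter(conn.getOutputStream(),
                                                StandardCharsets.UTF_8)) {
            wr.write(xml);
        }
        // Reading the response makes URLConnection complete the request.
        conn.getInputStream().close();
    }
}
```

The important part is that the declaration in the XML, the charset in the header, and the encoding used by the writer all agree. This sketch does not filter out control characters that are illegal in XML 1.0, so data containing them would still need cleaning.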
-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"