This is what the whole process looks like:

1. I have a web page that I want to index. I first copy that web page, break it down into different sections, and store them in different MySQL columns.
2. I then wrote a small PHP script that reads the values of all the fields from MySQL and writes them into an XML file.
3. I then use Solr to index this XML file, and the error that appears halfway through indexing is:
   "FATAL: Connection error (is Solr running at http://localhost/solr/update ?): java.io.IOException: Server returned HTTP Response code: 500 for URL: http://local/solr/update"
4. Although the error message doesn't say that this is a UTF-8 encoding error in the XML, I did a bit of research and looked at the XML file I have, and it is not valid UTF-8.
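One way to confirm the suspicion in step 4 is to ask a decoder where the file stops being valid UTF-8. The following is only a minimal sketch in Python, assuming the XML file fits in memory; the function name `first_invalid_utf8` and the file name `solr_docs.xml` are hypothetical and not part of the setup described above.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (byte offset, offending bytes) for the first invalid
    UTF-8 sequence in data, or None if everything decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start / exc.end delimit the bytes the decoder rejected.
        return exc.start, data[exc.start:exc.end]
    return None


# Hypothetical usage against the generated file:
#   with open("solr_docs.xml", "rb") as fh:
#       print(first_invalid_utf8(fh.read()))
```

A non-None result gives a byte offset to look at in the file, which usually points to the database field that carried the non-UTF-8 text.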
I have been trying this for a couple of hours, but to no avail. I would like to find out:

1. How do I know what encoding the web page that I copied into MySQL uses?
2. At what point in this whole process should I convert it to UTF-8?

I tried changing the collation of all the MySQL columns from latin1-swedish to UTF-8, but it still doesn't work.

Thanks!

On 6/9/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Thought this is not directly related to Solr, but I have a XML output from
> mysql database, but during indexing the XML output is not working. And the
> problem is part of the XML output is not in UTF-8 encoding, how can I
> convert it to UTF-8 and how do I know what kind of coding it uses in the
> first place (the data I export from the mysql database). Thanks!

How do you generate the XML output? The "output" itself is usually a raw byte array; it uses a "Transport" and an "Encoding". If you save it to the file system and forget about the "transport-layer encoding", you will get some new problems...

> during indexing the XML output is not working

What exactly happens? What kind of error messages do you get?
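To make the point about raw bytes concrete: the bytes read from the database only become text once an encoding is chosen, and the XML file should then be written as UTF-8 explicitly. The sketch below is in Python rather than PHP and rests on assumptions: the field values arrive as raw bytes, anything that is not valid UTF-8 is treated as cp1252 (which is what MySQL's latin1 roughly corresponds to), and the helper names `to_text` and `build_add_xml` are hypothetical. Trying UTF-8 first is a heuristic, not a reliable encoding detector.

```python
from xml.sax.saxutils import escape, quoteattr


def to_text(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode raw bytes as UTF-8 if possible, otherwise with the
    assumed fallback encoding (undecodable bytes become U+FFFD)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace")


def build_add_xml(fields: dict) -> bytes:
    """Build a one-document Solr <add> message and return it as
    UTF-8 bytes, with markup characters in the values escaped."""
    parts = ['<?xml version="1.0" encoding="UTF-8"?>\n', "<add><doc>"]
    for name, raw in fields.items():
        parts.append(
            "<field name=%s>%s</field>" % (quoteattr(name), escape(to_text(raw)))
        )
    parts.append("</doc></add>")
    # Encode once, at the very end, so the whole file is UTF-8.
    return "".join(parts).encode("utf-8")


# Hypothetical usage: write the bytes in binary mode so nothing
# re-encodes them on the way to disk.
#   with open("solr_docs.xml", "wb") as fh:
#       fh.write(build_add_xml({"title": b"caf\xe9"}))
```

The design choice here is to convert at a single, known point (when the XML is built) instead of relying on the column collation, which affects sorting and comparison but does not by itself re-encode bytes that are already stored.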