This is what the whole process looks like:

1. I have a web page that I want to index. I first copy that web page,
break it down into sections, and store each section in its own MySQL
column.
2. I then wrote a small PHP script that reads the values of all the
fields from MySQL and writes them into an XML file.
3. I then use Solr to index this XML file. The error that appears halfway
through indexing is: "FATAL: Connection error (is Solr running at
http://localhost/solr/update
?): java.io.IOException: Server returned HTTP Response code: 500 for URL:
http://local/solr/update"
4. Although the error message doesn't say that this is an XML UTF-8
encoding error, I did a bit of research and looked at the XML file that I
have, and it is not valid UTF-8.
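
One way to confirm the suspicion in step 4 is to try decoding the file as
UTF-8 and report where decoding first fails. This is only a minimal
sketch in Python (the script in this thread is PHP; Python is used here
purely for illustration). The function name and the sample bytes are
hypothetical.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first byte that breaks strict
    UTF-8 decoding, or None if the data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# Hypothetical sample: "café" stored as Latin-1 inside an ASCII XML document.
sample = b'<add><doc><field name="title">caf\xe9</field></doc></add>'

bad = first_invalid_utf8(sample)
if bad is None:
    print("valid UTF-8")
else:
    offset, value = bad
    print("invalid byte 0x%02x at offset %d" % (value, offset))
```

To check a real file, read it in binary mode (`open(path, "rb").read()`)
and pass the bytes to the function; the offset shows which record to
inspect in the database.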

I have been trying this for a couple of hours, but to no avail. I would
like to find out:
1. How can I tell which encoding the web page I copied into MySQL uses?
2. At what point in this whole process should I convert it to UTF-8? I
tried changing the collation of all the MySQL columns from
latin1_swedish_ci to UTF-8, but it still doesn't work.
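
For both questions, a common heuristic (not a guarantee) is: if the bytes
decode cleanly as UTF-8, treat them as UTF-8; otherwise assume a
single-byte encoding and re-encode. A minimal Python sketch, assuming
cp1252 as the fallback; that fallback is a guess, and the function name
and sample text are hypothetical. It could be applied at whichever step
first handles the raw bytes, for example right after fetching the page.

```python
def to_utf8(raw: bytes, fallback: str = "cp1252"):
    """Return (utf8_bytes, encoding_assumed) for a raw byte string."""
    try:
        raw.decode("utf-8")  # strict: raises on any invalid sequence
        return raw, "utf-8"
    except UnicodeDecodeError:
        # Bytes undefined in the fallback encoding become U+FFFD
        # instead of aborting the conversion.
        text = raw.decode(fallback, errors="replace")
        return text.encode("utf-8"), fallback


# Hypothetical page content that was saved in a single-byte encoding.
page = "Smörgåsbord".encode("cp1252")

converted, guessed = to_utf8(page)
print(guessed, converted.decode("utf-8"))
```

The heuristic cannot tell single-byte encodings apart from each other, so
if the page declares its own charset (HTTP header or meta tag), that
declaration is a better source than the fallback.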

Thanks!!!!

On 6/9/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Though this is not directly related to Solr: I have an XML output from
> a MySQL database, but indexing the XML output is not working. The
> problem is that part of the XML output is not in UTF-8 encoding. How
> can I convert it to UTF-8, and how do I know what kind of encoding it
> uses in the first place (the data I export from the MySQL database)?
> Thanks!

How do you generate the XML output? "Output" itself is usually a raw byte
array; it involves both a "transport" and an "encoding". If you save it
to a file system and forget about the transport-layer encoding, you will
get some new problems...
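
To illustrate the point about raw bytes: the same text becomes different
bytes depending on the encoding chosen at write time, so the XML
declaration has to match what is actually written. A minimal Python
sketch, assuming Solr's `<add><doc><field name="...">` update format; the
helper name, the field name "title", and the sample text are
hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(fields: dict) -> bytes:
    """Build an <add><doc> document and return it as UTF-8 bytes."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<add>", "<doc>"]
    for name, value in fields.items():
        # quoteattr() quotes and escapes the attribute value;
        # escape() handles &, < and > in the element text.
        lines.append("<field name=%s>%s</field>"
                     % (quoteattr(name), escape(value)))
    lines += ["</doc>", "</add>"]
    # Encode explicitly so the bytes match the declared encoding.
    return "\n".join(lines).encode("utf-8")


text = "café & crème"
print(text.encode("latin-1"))  # one byte per accented character
print(text.encode("utf-8"))    # two bytes per accented character

payload = build_add_xml({"title": text})
# Round trip: a conforming XML parser recovers the original text.
print(ET.fromstring(payload).find("doc/field").text)
```

If the bytes were written as Latin-1 while the declaration says UTF-8, a
strict XML parser would reject the file.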

> during indexing the XML output is not working
- What exactly happens, and which error messages do you get?


