Re: How can I omit the illegal characters when indexing the docs?

2009-01-04 Thread Peter Wolanin
For documents we are indexing via the PHP client, we are currently using the following regex to strip control characters from each field that might contain them:

function apachesolr_strip_ctl_chars($text) {
  // See: http://w3.org/International/questions/qa-forms-utf-8.html
  // Printable utf-8
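
The PHP function is not shown in full in this listing, so the following is an independent sketch of the same idea rather than the regex from the reply. It is written in Java because the original question generates its XML from Apache POI output. The class name ControlCharStripper and the method stripControlChars are hypothetical, and the character class is an assumption: it removes only the C0 control characters that XML 1.0 forbids (everything below U+0020 except tab, line feed, and carriage return), which covers the "CTRL-CHAR, code 8" case from the error message.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr, POI, or the apachesolr module.
final class ControlCharStripper {

    // Assumption: only C0 controls are stripped. Tab (0x09), LF (0x0A)
    // and CR (0x0D) are legal in XML 1.0 and are left in place.
    private static final Pattern CONTROL_CHARS =
            Pattern.compile("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F]");

    // Returns the text with forbidden control characters removed.
    // A null input is passed through unchanged.
    static String stripControlChars(String text) {
        if (text == null) {
            return null;
        }
        return CONTROL_CHARS.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // "\b" is the backspace character, code 8.
        String extracted = "Hello\b World";
        System.out.println(stripControlChars(extracted)); // prints "Hello World"
    }
}
```

Applying such a filter to each field value before it is written into the XML document should keep the parser from rejecting the update. It does not address other characters that XML 1.0 disallows, such as U+FFFE and U+FFFF or unpaired surrogates.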

How can I omit the illegal characters when indexing the docs?

2009-01-02 Thread RaghavPrabhu
Hi all, I am extracting a Word document using Apache POI and then generating an XML doc, which is the document that I want to index in Solr. The problem I faced is that it threw the error shown below in the browser: HTTP Status 500 - Illegal character ((CTRL-CHAR, code 8)) at