[ https://issues.apache.org/jira/browse/SOLR-38?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexandre Rafalovitch closed SOLR-38.
-------------------------------------
> PATCH: demonstrate correct handling of UTF-8 encoded input documents
> --------------------------------------------------------------------
>
> Key: SOLR-38
> URL: https://issues.apache.org/jira/browse/SOLR-38
> Project: Solr
> Issue Type: Improvement
> Components: update
> Reporter: Bertrand Delacretaz
> Priority: Minor
> Attachments: utf8-example.xml
>
>
> Here's a UTF-8 example with accented chars that can go in
> example/exampledocs, to demonstrate correct handling of accented chars.
> After posting this to SOLR, searching for "êâîôû" from
> http://localhost:8983/solr/admin/ correctly finds this document.
> This needs a small patch to example/exampledocs/post.sh (enclosed below) to
> specify the encoding for the POST.
> The XML pull parser seems to be able to handle the encoding declaration
> correctly, but if the encoding is not specified in the POST, the servlet
> container might get in the way (Jetty does with the current configuration).
> Index: example/exampledocs/post.sh
> ===================================================================
> --- example/exampledocs/post.sh (revision 424529)
> +++ example/exampledocs/post.sh (working copy)
> @@ -4,7 +4,7 @@
>
>  for f in $FILES; do
>    echo Posting file $f to $URL
> -  curl $URL --data-binary @$f
> +  curl $URL --data-binary @$f -H 'Content-type:text/xml; charset=utf-8'
>    echo
>  done
>
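For anyone trying the patched behaviour outside of post.sh, here is a minimal sketch of the same idea as a standalone function. It is not the committed script. The update URL and the DRY_RUN switch are assumptions made for illustration; only the curl options and the Content-type header come from the patch quoted above.

```shell
#!/bin/sh
# Sketch only: post XML files to Solr with an explicit UTF-8 charset.
# ASSUMPTION: the update URL below is a guess at a local example setup;
# override it by setting URL in the environment.
URL=${URL:-http://localhost:8983/solr/update}

# Header from the quoted patch: it tells the servlet container how the
# request body is encoded instead of leaving that to a container default.
CONTENT_TYPE='Content-type:text/xml; charset=utf-8'

# post_file FILE
# Sends FILE unmodified (--data-binary keeps the bytes as they are).
# With DRY_RUN=1 (a hypothetical switch for this sketch) the curl
# command is printed instead of being executed.
post_file() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf "curl %s --data-binary @%s -H '%s'\n" "$URL" "$1" "$CONTENT_TYPE"
  else
    curl "$URL" --data-binary "@$1" -H "$CONTENT_TYPE"
    echo
  fi
}
```

Setting DRY_RUN=1 and calling `post_file utf8-example.xml` shows the command that would be sent, which is a quick way to confirm the charset parameter is present before posting to a running instance.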