Just to add: Solr's XML files are parsed according to the XML spec, so you can choose any charset; you only have to declare it the way the XML spec requires. An XML POST to the update handler can also be in any encoding (it no longer needs to be declared in the HTTP header; the <?xml ...?> declaration is enough). There is already a test for this. I fixed all of this in endless sessions, but I was happy to do it, as my favourite data format is XML :-) [I refuse to fix this for DIH, but that's another story, SOLR-2347].
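To illustrate what "declared according to the XML spec" means in practice, here is a minimal sketch. It uses the JDK's standard JAXP parser, not Solr's own loading code, and the class name XmlEncodingDemo, the method parseText and the sample <field> element are made up for the example. The point is only that the parser is handed raw bytes and picks the charset from the <?xml ...?> declaration itself:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; it does not use any Solr API.
public class XmlEncodingDemo {

    // Parse raw bytes and return the text of the root element.
    // No charset is passed in: the parser reads it from the XML declaration.
    static String parseText(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Same content, two encodings, each declared in its own header.
        String latin1Xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><field>M\u00fcller</field>";
        String utf8Xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><field>M\u00fcller</field>";

        String fromLatin1 = parseText(latin1Xml.getBytes(StandardCharsets.ISO_8859_1));
        String fromUtf8 = parseText(utf8Xml.getBytes(StandardCharsets.UTF_8));

        // Both byte streams decode to the same string.
        System.out.println(fromLatin1.equals(fromUtf8));
    }
}
```

Handing the parser an InputStream rather than a Reader is the important design choice here: a Reader would force a charset before the parser ever sees the declaration.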
Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Yonik Seeley
> Sent: Thursday, July 05, 2012 5:43 PM
> To: [email protected]
> Subject: Re: Question about solr config files encoding.
>
> On Thu, Jul 5, 2012 at 10:59 AM, Dawid Weiss <[email protected]> wrote:
> > According to JSON RFC:
> >
> > http://tools.ietf.org/html/rfc4627#section-3
> >
> > JSON text SHALL be encoded in Unicode.
>
> One of my little pet peeves with the RFC - I think this was a bad requirement.
> JSON should have been text, and then there should have been an optional way
> to detect encoding if other mechanisms don't cover it (like HTTP headers, etc).
> This effectively means that something like ["hi"] is not valid JSON for many of
> you reading this email (if your email client is internally representing it as
> something other than unicode encoded for example).
>
> > We could just enforce/require UTF-8?
>
> Yes, Solr has normally always required/assumed UTF-8 for config files.
> It's simply an oversight in any places that don't.
>
> -Yonik
> http://lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
