Hi

Did you get a chance to work on this? We are working on releasing Camel 2.17, so it's time to step up if you want to have this issue resolved.
You are in a better position to fix or track down the issue, as you have
the problem and use the Russian locale.

On Wed, Mar 9, 2016 at 11:11 AM, Claus Ibsen <claus.ib...@gmail.com> wrote:
> Hi
>
> Yeah would be good if you can try the suggestions from Antoine. And if
> you can reproduce it in a unit test and possibly provide a fix in a PR /
> patch. We love contributions
> http://camel.apache.org/contributing
>
> On Tue, Mar 8, 2016 at 12:53 AM, Antoine Toulme <anto...@toulme.name> wrote:
>> What happens is that your default charset is win-1251 while the file is
>> UTF-8.
>>
>> The file is read correctly according to the charset argument passed to the
>> toInputStream method; however, the charset used to parse and send
>> the stream is the default charset.
>>
>> The immediate workaround for you is to add an explicit charset when
>> launching the JVM: -Dfile.encoding=UTF-8
>>
>> I would recommend you go ahead, file a bug and add a simple test case in
>> IOConverterTest around line 83.
>>
>>> On Mar 5, 2016, at 11:05 PM, fedd <feddkr...@hotmail.com> wrote:
>>>
>>> I made an experiment and saw that the situation is much worse than just
>>> losing one frequent Russian letter.
>>>
>>> I made a UTF-8 file with both Russian text and one German A Umlaut letter,
>>> and Camel was unable to read the German letter, replacing it with a question
>>> mark, just because my Windows dev machine's native charset happened to be
>>> win-1251.
>>>
>>> I don't really think it's okay
>>>
>>> 1) to ever flatten Unicode strings to a single byte character set;
>>>
>>> 2) when the behaviour of the server side code depends on the host operating
>>> system settings (becomes not portable)
>>>
>>> May I file a Jira bug report?
>>>
>>> Here's my route:
>>>
>>>    <dataFormats>
>>>        <json id="jack" library="Jackson" prettyPrint="true"/>
>>>    </dataFormats>
>>>
>>>    <route>
>>>
>>>        <from uri="file:///C:/tries/collApp/exchange/in?fileName=registerSampleUtf.csv&amp;charset=UTF-8"/>
>>>        <log message="file: ${body.class.name} ${body}" loggingLevel="WARN"/>
>>>        <unmarshal>
>>>            <csv delimiter=";" useMaps="true" />
>>>        </unmarshal>
>>>        <log message="unmarshalled: ${body.class.name} ${body}" loggingLevel="WARN"/>
>>>        <marshal ref="jack"/>
>>>        <log message="marshalled: ${body}" loggingLevel="WARN"/>
>>>        <to uri="file:///C:/tries/collApp/exchange/out?fileName=out.json"/>
>>>    </route>
>>>
>>> At the first "log" only the German letter is replaced with a question mark.
>>>
>>> At the second, all Russian letters are replaced with question marks.
>>>
>>> The resulting JSON can't even display the question marks when read in any of
>>> the world's encodings.
>>>
>>> Shall I provide a test CSV file here? (warning: it contains Russian letters)
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://camel.465427.n5.nabble.com/A-possible-bug-in-IOConverter-with-Win-1251-charset-tp5778665p5778666.html
>>> Sent from the Camel Development mailing list archive at Nabble.com.
>>
>
>
>
> --
> Claus Ibsen
> -----------------
> http://davsclaus.com @davsclaus
> Camel in Action 2: https://www.manning.com/ibsen2

--
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2
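
P.S. For anyone picking this up, here is a minimal, Camel-independent sketch of the mechanism Antoine describes: text decoded correctly from UTF-8 gets damaged as soon as it is pushed through a single-byte charset such as win-1251. The class name CharsetLossDemo and the method roundTrip are hypothetical names made up for illustration; this is not the IOConverter code, and it assumes the JVM ships the windows-1251 charset. Something along these lines could probably be adapted into the unit test.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Camel.
class CharsetLossDemo {

    // Encodes the text with the given charset and decodes it again,
    // imitating code that silently falls back to the platform default charset.
    static String roundTrip(String text, Charset via) {
        // String.getBytes(Charset) replaces unmappable characters with the
        // charset's replacement byte, which is '?' for windows-1251.
        byte[] bytes = text.getBytes(via);
        return new String(bytes, via);
    }

    public static void main(String[] args) {
        // A-umlaut (U+00C4), a space, and the Cyrillic letter zhe (U+0436),
        // written as escapes so the source file encoding does not matter.
        String original = "\u00C4 \u0436";

        // Stand-in for the platform default on a Russian Windows machine.
        Charset win1251 = Charset.forName("windows-1251");

        String lossy = roundTrip(original, win1251);
        String safe = roundTrip(original, StandardCharsets.UTF_8);

        // The umlaut has no mapping in windows-1251, so it comes back as '?'.
        System.out.println("? \u0436".equals(lossy)); // true
        // UTF-8 can represent every character, so nothing is lost.
        System.out.println(original.equals(safe));    // true
    }
}
```

This only reproduces the first symptom from the report (the German letter turning into a question mark). The Russian letters being lost at the second log step would need a further mismatch between the charset used to encode and the one used to decode, which this sketch does not cover.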