thanks, solved On 28 March 2011 09:30, Uwe Schindler <u...@thetaphi.de> wrote:
> Hi, > > Replace the "stupid": > writer = new BufferedWriter(new FileWriter(fileOutput)); > > by: > writer = new BufferedWriter(new OutputStreamWriter(new > FileOutputStream(fileOutput), "UTF-8")); > > Unfortunately, you cannot give a charset to FileWriter itself. > > ----- > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -----Original Message----- > > From: Patrick Diviacco [mailto:patrick.divia...@gmail.com] > > Sent: Monday, March 28, 2011 9:17 AM > > To: java-user@lucene.apache.org > > Subject: Re: file formats: MacRoman and UTF-8... > > > > hi, I'm using my own code: > > > > > > > > Writer writer = null; > > > > try { > > //File fileOutput = new File("output.trectext"); File fileOutput = new > > File(args[1]); writer = new BufferedWriter(new FileWriter(fileOutput)); > > writer.write(contents.toString()); > > } catch (FileNotFoundException e) { > > e.printStackTrace(); > > } catch (IOException e) { > > e.printStackTrace(); > > } finally { > > try { > > if (writer != null) { > > writer.close(); > > } > > } catch (IOException e) { > > e.printStackTrace(); > > } > > } > > > > > > > > > > > > On 28 March 2011 09:13, Paul Libbrecht <p...@hoplahup.net> wrote: > > > > > java -Dfile.encoding=utf-8 > > > should do the trick. > > > > > > Or... which java app are you using? > > > > > > paul > > > > > > > > > Le 28 mars 2011 à 09:03, Patrick Diviacco a écrit : > > > > > > > When I run my Lucene app and a parse a xml file I get the following > > > > error due to some fonts such as "é" written in the text file. > > > > > > > > If I save the text file as UTF-8 with my text editor I don't have > > > > this issue, but when I create it with a java app, it is saved as > MacRoman. > > > > > > > > How can I specify a different format with Java instead ? > > > > > > > > thanks > > > > > > > > [CODE]Exception in thread "main" > > > > > > > > > com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceExcepti > > on: > > > > Invalid byte 1 of 1-byte UTF-8 sequence. > > > > at > > > > > > > com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8 > > > Reader.java:684) > > > > at > > > > > > > com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader. > > > java:554) > > > > at > > > > > > > com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntit > > > yScanner.java:1742) > > > > at > > > > > > > com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLE > > > ntityScanner.java:1416) > > > > at > > > > > > > > > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerIm > > pl > > > > > $FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:279 > > 2) > > > > at > > > > > > > > > com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(X > > M > > > LDocumentScannerImpl.java:648) > > > > at > > > > > > > > > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerIm > > pl > > > .scanDocument(XMLDocumentFragmentScannerImpl.java:511) > > > > at > > > > > > > > > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XM > > > L11Configuration.java:808) > > > > at > > > > > > > > > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XM > > > L11Configuration.java:737) > > > > at > > > > > > > com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.j > > > ava:119) > > > > at > > > > > > > com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Abs > > > tractSAXParser.java:1205) > > > > at > > > > > > > > > com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.pa > > > rse(SAXParserImpl.java:522) > > > > at org.apache.commons.digester.Digester.parse(Digester.java:1871) > > > > at CollectionIndexer.main(CollectionIndexer.java:111)[/CODE] > > > > > > > > > --------------------------------------------------------------------- > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > > > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >