Hi, I'm using my own code to write the file:


Writer writer = null;

try {
    //File fileOutput = new File("output.trectext");
    File fileOutput = new File(args[1]);
    // FileWriter encodes with the JVM default charset
    writer = new BufferedWriter(new FileWriter(fileOutput));
    writer.write(contents.toString());
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
} finally {
    try {
        if (writer != null) {
            writer.close();
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
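
For comparison, here is a minimal sketch of how the same write could name its encoding explicitly instead of relying on the JVM default, assuming UTF-8 is the encoding the XML parser expects. The class name Utf8WriteSketch and the helper writeUtf8 are hypothetical names used only for illustration, and the sample text stands in for contents.toString(). It assumes Java 7 or later (try-with-resources and StandardCharsets); on older JVMs the charset name "UTF-8" can be passed to OutputStreamWriter as a String instead.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original program.
class Utf8WriteSketch {

    // Writes text to the file as UTF-8, independent of the JVM default charset.
    static void writeUtf8(File file, String text) throws IOException {
        // try-with-resources closes the writer even if write() throws.
        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Same argument convention as the code above; the fallback name is illustrative.
        File fileOutput = new File(args.length > 1 ? args[1] : "output.trectext");
        // Sample text containing a non-ASCII character (e with acute accent).
        writeUtf8(fileOutput, "caf\u00e9\n");
    }
}
```

With this approach the bytes on disk should not depend on the platform default or on -Dfile.encoding, since the charset is chosen at the point where characters are turned into bytes.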





On 28 March 2011 09:13, Paul Libbrecht <p...@hoplahup.net> wrote:

> java -Dfile.encoding=utf-8
> should do the trick.
>
> Or... which java app are you using?
>
> paul
>
>
> On 28 March 2011 at 09:03, Patrick Diviacco wrote:
>
> > When I run my Lucene app and parse an XML file, I get the following error,
> > caused by characters such as "é" in the text file.
> >
> > If I save the text file as UTF-8 with my text editor I don't have this
> > issue, but when I create it with a java app, it is saved as MacRoman.
> >
> > How can I specify a different encoding from Java instead?
> >
> > thanks
> >
> > [CODE]Exception in thread "main" com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
> >     at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
> >     at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
> >     at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
> >     at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1416)
> >     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2792)
> >     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
> >     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
> >     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
> >     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
> >     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
> >     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
> >     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
> >     at org.apache.commons.digester.Digester.parse(Digester.java:1871)
> >     at CollectionIndexer.main(CollectionIndexer.java:111)[/CODE]
