[ http://jira.codehaus.org/browse/MNG-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=221887#action_221887 ]
Benjamin Bentmann commented on MNG-4667:
----------------------------------------

The need for encoding parameters smells: XML files have an encoding, either explicitly set via the XML declaration or implicitly defaulted to UTF-8, so a user shouldn't have to re-specify it (which just creates the risk of specifying the wrong encoding and messing up the file).

Also, your patch handles BOM stripping at the byte (and not character) level, so I don't see a reason to involve readers/writers at all; a plain stream-to-stream copy (e.g. via IOUtils from plexus-utils) should do. If readers/writers are needed, it's worth looking at ReaderFactory.newXmlReader() from plexus-utils, which handles XML encoding detection.

> Maven 2.2.1 XML parser fails to parse a UTF-8 POM that begins with a BOM
> ------------------------------------------------------------------------
>
>                 Key: MNG-4667
>                 URL: http://jira.codehaus.org/browse/MNG-4667
>             Project: Maven 2 & 3
>          Issue Type: Bug
>          Components: POM::Encoding
>    Affects Versions: 2.2.1
>            Reporter: Maria Catherine Tan
>         Attachments: MNG-4667-with-encoding.patch, MNG-4667.patch, pom.xml
>
>
> I've seen a lot of issues related to this that were closed because they're a
> duplicate of MNG-2254 but I think the fix for MNG-2254 doesn't fix this issue.
> I'm using maven 2.2.1 and the build failed when the UTF-8 POM begins with a
> BOM.
> Here's the log when running clean install
> {noformat}
> Reason: Parse error reading POM. Reason: only whitespace content allowed before start tag and not \uef (position: START_DOCUMENT seen \uef... @1:1) for project unknown at /home/marica/quick/pom.xml
> [INFO] ------------------------------------------------------------------------
> [INFO] Trace
> org.apache.maven.reactor.MavenExecutionException: Parse error reading POM. Reason: only whitespace content allowed before start tag and not \uef (position: START_DOCUMENT seen \uef... @1:1) for project unknown at /home/marica/quick/pom.xml
>     at org.apache.maven.DefaultMaven.getProjects(DefaultMaven.java:404)
>     at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:272)
>     at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
>     at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
>     at org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:585)
>     at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
>     at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
>     at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
>     at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
> Caused by: org.apache.maven.project.InvalidProjectModelException: Parse error reading POM. Reason: only whitespace content allowed before start tag and not \uef (position: START_DOCUMENT seen \uef... @1:1) for project unknown at /home/marica/quick/pom.xml
>     at org.apache.maven.project.DefaultMavenProjectBuilder.readModel(DefaultMavenProjectBuilder.java:1610)
>     at org.apache.maven.project.DefaultMavenProjectBuilder.readModel(DefaultMavenProjectBuilder.java:1571)
>     at org.apache.maven.project.DefaultMavenProjectBuilder.buildFromSourceFileInternal(DefaultMavenProjectBuilder.java:506)
>     at org.apache.maven.project.DefaultMavenProjectBuilder.build(DefaultMavenProjectBuilder.java:200)
>     at org.apache.maven.DefaultMaven.getProject(DefaultMaven.java:604)
>     at org.apache.maven.DefaultMaven.collectProjects(DefaultMaven.java:487)
>     at org.apache.maven.DefaultMaven.getProjects(DefaultMaven.java:391)
>     ... 12 more
> Caused by: org.codehaus.plexus.util.xml.pull.XmlPullParserException: only whitespace content allowed before start tag and not \uef (position: START_DOCUMENT seen \uef... @1:1)
>     at hidden.org.codehaus.plexus.util.xml.pull.MXParser.parseProlog(MXParser.java:1528)
>     at hidden.org.codehaus.plexus.util.xml.pull.MXParser.nextImpl(MXParser.java:1407)
>     at hidden.org.codehaus.plexus.util.xml.pull.MXParser.next(MXParser.java:1105)
>     at org.apache.maven.model.io.xpp3.MavenXpp3Reader.read(MavenXpp3Reader.java:3911)
>     at org.apache.maven.project.DefaultMavenProjectBuilder.readModel(DefaultMavenProjectBuilder.java:1606)
> {noformat}

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
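
For illustration, the byte-level, stream-to-stream copy suggested in the comment could look roughly like the sketch below. It is not the attached patch and not Maven or plexus-utils code: the class and method names (BomStrippingCopy, copyWithoutBom) are hypothetical, and it uses only the JDK so that it runs without extra dependencies. The final copy loop is the part a stream copy helper from a utility library could replace.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, for illustration only.
public class BomStrippingCopy {

    // Length in bytes of the UTF-8 byte order mark (EF BB BF).
    private static final int BOM_LENGTH = 3;

    /** Copies in to out byte for byte, dropping a leading UTF-8 BOM if present. */
    public static void copyWithoutBom(InputStream in, OutputStream out) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, BOM_LENGTH);
        byte[] head = new byte[BOM_LENGTH];

        // Read up to three bytes; a single read() call may return fewer.
        int n = 0;
        while (n < head.length) {
            int r = pin.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream is shorter than a BOM
            }
            n += r;
        }

        boolean hasBom = n == BOM_LENGTH
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // No BOM: push the bytes back so they are copied like any others.
        if (!hasBom && n > 0) {
            pin.unread(head, 0, n);
        }

        // Plain byte copy; no Reader/Writer and no encoding parameter involved.
        byte[] buf = new byte[8192];
        int r;
        while ((r = pin.read(buf)) != -1) {
            out.write(buf, 0, r);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] pom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'p', '/', '>'};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyWithoutBom(new ByteArrayInputStream(pom), out);
        System.out.println(out.toString("UTF-8"));
    }
}
```

Because the bytes after the BOM are passed through untouched, the XML declaration (or the UTF-8 default) remains the only statement of the file's encoding, which is the point made in the comment about not asking the user to re-specify it.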