[jira] [Updated] (MCHANGELOG-142) UTF-8 Encoding doubled

2016-10-15 Thread Michael Osipov (JIRA)

 [ https://issues.apache.org/jira/browse/MCHANGELOG-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Osipov updated MCHANGELOG-142:
--
Labels: sample-project-missing  (was: )

> UTF-8 Encoding doubled
> --
>
> Key: MCHANGELOG-142
> URL: https://issues.apache.org/jira/browse/MCHANGELOG-142
> Project: Maven Changelog Plugin
> Issue Type: Bug
> Affects Versions: 2.3
> Reporter: Jukka Harkki
> Labels: sample-project-missing
>
> Creating the changelog.xml file double-encodes UTF-8 text when the git commit
> comments are already in UTF-8. For example, if the outputEncoding property is
> set to ISO-8859-1, the output is (shown as an od dump):
> {code}
> 0004060 7375 7420 696f 696d 616d 6e61 6d20 c379
>   u   s   t   o   i   m   i   m   a   a   n   m   y   ├
> 0004100 73b6 6c20 7369 a4c3 6b79 6573 7373 a4c3
>   Â   s   l   i   s   ├   ñ   y   k   s   e   s   s   ├   ñ
> {code}
> And when set to UTF-8 the output is:
> {code}
> 0004060 6d69 6d69 6161 206e 796d 83c3 b6c2 2073
>   i   m   i   m   a   a   n   m   y   ├   â   ┬   Â   s
> {code}
> With UTF-8 output, the Scandinavian umlauts are garbled: the bytes C3 B6,
> which are the correct UTF-8 encoding of the letter "ö", come out as
> C3 83 C2 B6.
> The ISO-8859-1 setting would do for the site documentation, but since the
> changelog.xml header then declares ISO-8859-1 encoding, the rest of the
> process fails to process the umlauts.
> I modified class ChangeLogReport method writeChangelogXml() by commenting out 
> issue MCHANGELOG-86 writer change:
> {code}
> // Workaround: PrintWriter(OutputStream) writes with the platform default
> // charset instead of getOutputEncoding().
> PrintWriter pw = new PrintWriter(new BufferedOutputStream(
>         new FileOutputStream(outputXML)));
> pw.write(changelogXml.toString());
> pw.flush();
> pw.close();
> // MCHANGELOG-86 change, commented out:
> //Writer writer = WriterFactory.newWriter( new BufferedOutputStream(
> //        new FileOutputStream( outputXML ) ),
> //        getOutputEncoding() );
> //writer.write(changelogXml.toString());
> //writer.flush();
> //writer.close();
> {code}
> There may be double encoding in the Writer, since a couple of lines above
> the change set is created with the encoding information:
> {code}
> String changeset = changelogSet.toXML(getOutputEncoding());
> {code}
> However, this is just a wild guess, since I did not check the implementation
> of changelogSet.toXML() or writer.write(). The cause could also be something
> different in the version control access, since MCHANGELOG-86 was an SVN issue
> and here we are dealing with Git.
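
The byte pattern in the second dump (C3 83 C2 B6 in place of C3 B6) is what appears when UTF-8 bytes are misread as ISO-8859-1 and then encoded to UTF-8 a second time. The following is a minimal sketch of that mechanism using only JDK classes. The class and method names are hypothetical and are not taken from the plugin, and the sketch does not show where in the plugin the misreading happens.

```java
import java.nio.charset.StandardCharsets;

public class DoubleEncodingDemo {

    // Encode to UTF-8, misread those bytes as ISO-8859-1, then encode to UTF-8 again.
    public static byte[] doubleEncode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    // Undo the accident: decode UTF-8, map each char back to one byte, decode UTF-8 again.
    public static String repair(byte[] doubled) {
        String misread = new String(doubled, StandardCharsets.UTF_8);
        byte[] original = misread.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    // Format bytes as space-separated lowercase hex pairs.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String o = "\u00f6"; // the letter "ö", escaped so the source file encoding does not matter
        System.out.println(hex(o.getBytes(StandardCharsets.UTF_8))); // c3 b6
        System.out.println(hex(doubleEncode(o)));                     // c3 83 c2 b6
        System.out.println(repair(doubleEncode(o)).equals(o));        // true
    }
}
```

The `repair` method is only a diagnostic aid: if it restores readable text from a changelog.xml, the file was very likely double-encoded in this way.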



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MCHANGELOG-142) UTF-8 Encoding doubled

2016-02-04 Thread Jukka Harkki (JIRA)

 [ https://issues.apache.org/jira/browse/MCHANGELOG-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Harkki updated MCHANGELOG-142:

Description: 
Creating the changelog.xml file double-encodes UTF-8 text when the git commit
comments are already in UTF-8. For example, if the outputEncoding property is
set to ISO-8859-1, the output is (shown as an od dump):
{code}
0004060 7375 7420 696f 696d 616d 6e61 6d20 c379
  u   s   t   o   i   m   i   m   a   a   n   m   y   ├
0004100 73b6 6c20 7369 a4c3 6b79 6573 7373 a4c3
  Â   s   l   i   s   ├   ñ   y   k   s   e   s   s   ├   ñ
{code}
And when set to UTF-8 the output is:
{code}
0004060 6d69 6d69 6161 206e 796d 83c3 b6c2 2073
  i   m   i   m   a   a   n   m   y   ├   â   ┬   Â   s
{code}
With UTF-8 output, the Scandinavian umlauts are garbled: the bytes C3 B6, which
are the correct UTF-8 encoding of the letter "ö", come out as C3 83 C2 B6.

The ISO-8859-1 setting would otherwise do for the site documentation, but since
the XML header of the file then declares ISO-8859-1 encoding, the rest of the
process fails.

I modified class ChangeLogReport method writeChangelogXml() by commenting out 
issue MCHANGELOG-86 writer change:
{code}
// Workaround: PrintWriter(OutputStream) writes with the platform default
// charset instead of getOutputEncoding().
PrintWriter pw = new PrintWriter(new BufferedOutputStream(
        new FileOutputStream(outputXML)));
pw.write(changelogXml.toString());
pw.flush();
pw.close();
// MCHANGELOG-86 change, commented out:
//Writer writer = WriterFactory.newWriter( new BufferedOutputStream(
//        new FileOutputStream( outputXML ) ),
//        getOutputEncoding() );
//writer.write(changelogXml.toString());
//writer.flush();
//writer.close();
{code}

There may be double encoding in the Writer, since a couple of lines above the
change set is created with the encoding information. However, this is just a
wild guess, since I did not check the implementation of changelogSet.toXML() or
writer.write():
{code}
String changeset = changelogSet.toXML(getOutputEncoding());
{code}



[jira] [Updated] (MCHANGELOG-142) UTF-8 Encoding doubled

2016-02-04 Thread Jukka Harkki (JIRA)

 [ https://issues.apache.org/jira/browse/MCHANGELOG-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Harkki updated MCHANGELOG-142:

Description: 
Creating the changelog.xml file double-encodes UTF-8 text when the git commit
comments are already in UTF-8. For example, if the outputEncoding property is
set to ISO-8859-1, the output is (shown as an od dump):
{code}
0004060 7375 7420 696f 696d 616d 6e61 6d20 c379
  u   s   t   o   i   m   i   m   a   a   n   m   y   ├
0004100 73b6 6c20 7369 a4c3 6b79 6573 7373 a4c3
  Â   s   l   i   s   ├   ñ   y   k   s   e   s   s   ├   ñ
{code}
And when set to UTF-8 the output is:
{code}
0004060 6d69 6d69 6161 206e 796d 83c3 b6c2 2073
  i   m   i   m   a   a   n   m   y   ├   â   ┬   Â   s
{code}
With UTF-8 output, the Scandinavian umlauts are garbled: the bytes C3 B6, which
are the correct UTF-8 encoding of the letter "ö", come out as C3 83 C2 B6.

The ISO-8859-1 setting would do for the site documentation, but since the
changelog.xml header then declares ISO-8859-1 encoding, the rest of the
process fails to process the umlauts.

I modified class ChangeLogReport method writeChangelogXml() by commenting out 
issue MCHANGELOG-86 writer change:
{code}
// Workaround: PrintWriter(OutputStream) writes with the platform default
// charset instead of getOutputEncoding().
PrintWriter pw = new PrintWriter(new BufferedOutputStream(
        new FileOutputStream(outputXML)));
pw.write(changelogXml.toString());
pw.flush();
pw.close();
// MCHANGELOG-86 change, commented out:
//Writer writer = WriterFactory.newWriter( new BufferedOutputStream(
//        new FileOutputStream( outputXML ) ),
//        getOutputEncoding() );
//writer.write(changelogXml.toString());
//writer.flush();
//writer.close();
{code}

There may be double encoding in the Writer, since a couple of lines above the
change set is created with the encoding information. However, this is just a
wild guess, since I did not check the implementation of changelogSet.toXML() or
writer.write():
{code}
String changeset = changelogSet.toXML(getOutputEncoding());
{code}



[jira] [Updated] (MCHANGELOG-142) UTF-8 Encoding doubled

2016-02-04 Thread Jukka Harkki (JIRA)

 [ https://issues.apache.org/jira/browse/MCHANGELOG-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Harkki updated MCHANGELOG-142:

Description: 
Creating the changelog.xml file double-encodes UTF-8 text when the git commit
comments are already in UTF-8. For example, if the outputEncoding property is
set to ISO-8859-1, the output is (shown as an od dump):
{code}
0004060 7375 7420 696f 696d 616d 6e61 6d20 c379
  u   s   t   o   i   m   i   m   a   a   n   m   y   ├
0004100 73b6 6c20 7369 a4c3 6b79 6573 7373 a4c3
  Â   s   l   i   s   ├   ñ   y   k   s   e   s   s   ├   ñ
{code}
And when set to UTF-8 the output is:
{code}
0004060 6d69 6d69 6161 206e 796d 83c3 b6c2 2073
  i   m   i   m   a   a   n   m   y   ├   â   ┬   Â   s
{code}
With UTF-8 output, the Scandinavian umlauts are garbled: the bytes C3 B6, which
are the correct UTF-8 encoding of the letter "ö", come out as C3 83 C2 B6.

The ISO-8859-1 setting would do for the site documentation, but since the
changelog.xml header then declares ISO-8859-1 encoding, the rest of the
process fails to process the umlauts.

I modified class ChangeLogReport method writeChangelogXml() by commenting out 
issue MCHANGELOG-86 writer change:
{code}
// Workaround: PrintWriter(OutputStream) writes with the platform default
// charset instead of getOutputEncoding().
PrintWriter pw = new PrintWriter(new BufferedOutputStream(
        new FileOutputStream(outputXML)));
pw.write(changelogXml.toString());
pw.flush();
pw.close();
// MCHANGELOG-86 change, commented out:
//Writer writer = WriterFactory.newWriter( new BufferedOutputStream(
//        new FileOutputStream( outputXML ) ),
//        getOutputEncoding() );
//writer.write(changelogXml.toString());
//writer.flush();
//writer.close();
{code}

There may be double encoding in the Writer, since a couple of lines above the
change set is created with the encoding information:
{code}
String changeset = changelogSet.toXML(getOutputEncoding());
{code}
However, this is just a wild guess, since I did not check the implementation of
changelogSet.toXML() or writer.write(). The cause could also be something
different in the version control access, since MCHANGELOG-86 was an SVN issue
and here we are dealing with Git.
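
The downstream failure described above comes from a mismatch between the encoding declared in the XML header and the encoding of the bytes actually written. As a hedged illustration, the sketch below writes a small XML string with an explicit charset so that the declaration and the bytes agree, then shows what a reader sees when it trusts the wrong declaration. It uses only JDK classes rather than the plugin's WriterFactory, and the class name, method name, and sample element are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetWrite {

    // Serialize XML so that the declared encoding and the byte encoding agree.
    public static byte[] writeXml(String body, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write("<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>\n");
            writer.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String body = "<msg>my\u00f6s</msg>"; // "myös", escaped to be independent of source encoding

        byte[] utf8 = writeXml(body, StandardCharsets.UTF_8);
        byte[] latin1 = writeXml(body, StandardCharsets.ISO_8859_1);

        // Decoding with the declared charset round-trips the text.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).endsWith(body));
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1).endsWith(body));

        // Decoding UTF-8 bytes as ISO-8859-1 turns "ö" into the two characters U+00C3 U+00B6.
        String mismatched = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(mismatched.contains("my\u00c3\u00b6s"));
    }
}
```

The last check corresponds to the ISO-8859-1 case in the report: the file holds UTF-8 bytes, but a consumer that believes the ISO-8859-1 header decodes each byte as a separate character.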
