[GitHub] [incubator-daffodil] olabusayoT commented on a change in pull request #254: Adds hex/utf-8 data dump on left over data

GitBox Fri, 28 Jun 2019 05:12:21 -0700

olabusayoT commented on a change in pull request #254: Adds hex/utf-8 data dump 
on left over data
URL: https://github.com/apache/incubator-daffodil/pull/254#discussion_r298567599


 ##########
 File path: daffodil-io/src/main/scala/org/apache/daffodil/io/Dump.scala
 ##########
 @@ -598,13 +599,13 @@ class DataDumper {
 
     val endByteAddress0b = math.max(startByteAddress0b + lengthInBytes - 1, 0)
     // val cs = optEncodingName.map { Charset.forName(_) }
-    val decoder = getReplacingDecoder(optEncodingName)
+    val decoder = getReportingDecoder(optEncodingName)
     var i = startByteAddress0b
     val sb = new StringBuilder
     while (i <= endByteAddress0b) {
-      val (cR, _, _) = convertToCharRepr(i - startByteAddress0b, endByteAddress0b, byteSource, decoder)
-      sb += cR(0)
-      i += 1
+      val (cR, nBytesConsumed, _) = convertToCharRepr(i - startByteAddress0b, endByteAddress0b, byteSource, decoder)
 
 Review comment:
   We do appear to have code that checks for non-byte-aligned encodings, and 
in those cases we fall back to per-byte decoding based on windows-1252. 
Depending on where this is called from, we get either text and hex (e.g. 
trace output) or text only (e.g. the left-over data dump).
   
   It doesn't look like we have tests for decoding non-8-bit charset encodings.
   
   This change was intended to add multibyte sequence support to the textOnly 
dump, which the original code lacked.
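
   To illustrate the idea, here is a minimal standalone sketch of a text-only 
dump loop that advances by the number of bytes each character consumed rather 
than by one. It is not the code in Dump.scala: `reportingDecoder`, `decodeOne` 
and `dumpText` are hypothetical names, the 4-byte lookahead limit is an 
assumption, and substituting U+FFFD for an undecodable byte is just one 
possible fallback.

```scala
import java.nio.{ByteBuffer, CharBuffer}
import java.nio.charset.{Charset, CharsetDecoder, CodingErrorAction}

object TextOnlyDumpSketch {

  // Hypothetical stand-in for getReportingDecoder: a decoder that reports
  // malformed or unmappable input instead of silently replacing it.
  def reportingDecoder(cs: Charset): CharsetDecoder =
    cs.newDecoder()
      .onMalformedInput(CodingErrorAction.REPORT)
      .onUnmappableCharacter(CodingErrorAction.REPORT)

  // Decode one character starting at `start`, trying 1, 2, ... bytes until
  // the decoder accepts the prefix. Returns (text, nBytesConsumed).
  // Assumption: no character needs more than 4 bytes.
  def decodeOne(bytes: Array[Byte], start: Int, decoder: CharsetDecoder): (String, Int) = {
    val maxLen = math.min(4, bytes.length - start)
    var result: Option[(String, Int)] = None
    var n = 1
    while (result.isEmpty && n <= maxLen) {
      decoder.reset()
      val in = ByteBuffer.wrap(bytes, start, n)
      val out = CharBuffer.allocate(2) // room for a surrogate pair
      val res = decoder.decode(in, out, true)
      if (!res.isError && out.position() > 0) {
        out.flip()
        result = Some((out.toString, n))
      }
      n += 1
    }
    // Fallback for undecodable data: emit U+FFFD and skip a single byte.
    result.getOrElse(("\uFFFD", 1))
  }

  // Text-only dump: advance by however many bytes each character consumed.
  def dumpText(bytes: Array[Byte], cs: Charset): String = {
    val decoder = reportingDecoder(cs)
    val sb = new StringBuilder
    var i = 0
    while (i < bytes.length) {
      val (text, nBytesConsumed) = decodeOne(bytes, i, decoder)
      sb ++= text
      i += nBytesConsumed
    }
    sb.toString
  }
}
```

   With a replacing decoder and `i += 1`, the second byte of a two-byte UTF-8 
sequence would be decoded again on its own; the reporting decoder lets the 
loop tell "need more bytes" apart from "consumed n bytes".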

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
