GitHub user mosermw commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1469#discussion_r99244109
  
    --- Diff: nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/SchemaRecordWriter.java ---
    @@ -136,4 +144,44 @@ private void writeFieldValue(final RecordField field, final Object value, final
                     break;
             }
         }
    +
    +    private void writeUTFLimited(final DataOutputStream out, final String utfString) throws IOException {
    +        try {
    +            out.writeUTF(utfString);
    +        } catch (UTFDataFormatException e) {
    +            final String truncated = utfString.substring(0, getCharsInUTFLength(utfString, MAX_ALLOWED_UTF_LENGTH));
    +            logger.warn("Truncating UTF value!  Attempted to write string with char length {} and UTF length greater than "
    +                            + "supported maximum allowed ({}), truncating to char length {}.",
    +                    utfString.length(), MAX_ALLOWED_UTF_LENGTH, truncated.length());
    --- End diff --
    
    Can we mention provenance in this message, such as "Truncating provenance 
record value"?  Does this message potentially mix char length and byte length, 
as in "Attempted to write string with char length 40000 and UTF length 
greater than supported maximum allowed (65535), truncating to char length 
39000."?  Perhaps use a simpler message, such as "Attempted to store string 
with length 40000, truncating to 39000."
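
To illustrate the char length versus byte length distinction raised above, here is a minimal, self-contained sketch. It is not the code from this pull request: the class `UtfLengthSketch` and the helpers `utfBytes`, `utfLength`, and `charsThatFit` are hypothetical names, and the body of the PR's `getCharsInUTFLength` is not shown in this diff, so it may well work differently. The sketch relies only on the documented behavior of `DataOutputStream.writeUTF`, which encodes chars as 1 to 3 bytes each (modified UTF-8) and rejects strings whose encoded form exceeds 65535 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.util.Arrays;

final class UtfLengthSketch {
    // writeUTF stores the encoded byte count in an unsigned 16-bit prefix.
    static final int MAX_UTF_BYTES = 65535;

    /** Bytes that writeUTF needs for one char in modified UTF-8. */
    static int utfBytes(final char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            return 1;
        }
        if (c <= 0x07FF) {
            return 2; // also covers '\u0000', which modified UTF-8 writes as two bytes
        }
        return 3;     // also covers each half of a surrogate pair
    }

    /** Encoded length of the string in bytes, excluding the 2-byte length prefix. */
    static long utfLength(final String s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            bytes += utfBytes(s.charAt(i));
        }
        return bytes;
    }

    /** Hypothetical helper: the largest char count whose encoding fits in maxBytes. */
    static int charsThatFit(final String s, final int maxBytes) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            bytes += utfBytes(s.charAt(i));
            if (bytes > maxBytes) {
                return i;
            }
        }
        return s.length();
    }

    public static void main(final String[] args) throws IOException {
        // 40000 copies of U+00E9, which needs two bytes per char.
        final char[] chars = new char[40000];
        Arrays.fill(chars, '\u00e9');
        final String value = new String(chars);

        final DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(value);
        } catch (UTFDataFormatException e) {
            // The char length (40000) is under 65535, but the byte length (80000) is not.
            final String truncated = value.substring(0, charsThatFit(value, MAX_UTF_BYTES));
            out.writeUTF(truncated);
            System.out.println("char length " + value.length()
                    + ", UTF length " + utfLength(value)
                    + ", truncated to char length " + truncated.length());
        }
    }
}
```

In this sketch a 40000-char string fails to write even though 40000 is below 65535, because the limit applies to encoded bytes. A log message that reports only char lengths alongside the 65535 limit can therefore look contradictory, which is the concern in the comment above.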


