[ https://issues.apache.org/jira/browse/PARQUET-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678918#comment-17678918 ]
ASF GitHub Bot commented on PARQUET-2231:
-----------------------------------------

pitrou commented on code in PR #189:
URL: https://github.com/apache/parquet-format/pull/189#discussion_r1081911430


##########
Encodings.md:
##########
@@ -299,9 +302,18 @@ For a longer description, see https://en.wikipedia.org/wiki/Incremental_encoding
 This is stored as a sequence of delta-encoded prefix lengths (DELTA_BINARY_PACKED), followed by the suffixes encoded as delta length byte arrays (DELTA_LENGTH_BYTE_ARRAY).
 
+For example, if the data was "axis", "axle", "babble", "babyhood":
+
+The encoded data would be comprised of the following segments:

Review Comment:
   ```suggestion
   For example, if the data was "axis", "axle", "babble", "babyhood" then the encoded data would be comprised of the following segments:
   ```


> [Format] Encoding spec incorrect for DELTA_BYTE_ARRAY
> -----------------------------------------------------
>
>                 Key: PARQUET-2231
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2231
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-format
>            Reporter: Antoine Pitrou
>            Assignee: Antoine Pitrou
>            Priority: Critical
>             Fix For: format-2.10.0
>
>
> The spec says that DELTA_BYTE_ARRAY is only supported for BYTE_ARRAY, but in
> parquet-mr it has been allowed for FIXED_LEN_BYTE_ARRAY as well since 2015.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)