[
https://issues.apache.org/jira/browse/PARQUET-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101484#comment-17101484
]
ASF GitHub Bot commented on PARQUET-1827:
-----------------------------------------
gszadovszky commented on a change in pull request #778:
URL: https://github.com/apache/parquet-mr/pull/778#discussion_r421381112
##########
File path:
parquet-column/src/test/java/org/apache/parquet/schema/TestPrimitiveStringifier.java
##########
@@ -309,6 +308,35 @@ public void testDecimalStringifier() {
checkThrowingUnsupportedException(stringifier, Integer.TYPE, Long.TYPE,
Binary.class);
}
+ @Test
+ public void testUUIDStringifier() {
+ PrimitiveStringifier stringifier = PrimitiveStringifier.UUID_STRINGIFIER;
+
+ assertEquals("00112233-4455-6677-8899-aabbccddeeff", stringifier.stringify(
+ toBinary(0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99,
0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff)));
+ assertEquals("00000000-0000-0000-0000-000000000000", stringifier.stringify(
+ toBinary(0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00)));
+ assertEquals("ffffffff-ffff-ffff-ffff-ffffffffffff", stringifier.stringify(
+ toBinary(0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
0xff, 0xff, 0xff, 0xff, 0xff, 0xff)));
+
+ assertEquals("0eb1497c-19b6-42bc-b028-b4b612bed141", stringifier.stringify(
Review comment:
The idea is to have 3 kind-of corner cases and 3 common (random but
constant) values. What do you mean by duplicate coverage? (I think we do not
need exhaustive testing for the stringifiers because they are only used by our
tools for debugging purposes.)
The stringifiers do not validate the data they get, for performance
reasons. So, if the array is longer than 16 bytes, it would simply stringify
the first 16 and skip the rest. If the array is shorter than 16 bytes, an
`ArrayIndexOutOfBoundsException` would be thrown. Do you think we should test
these cases? They would not reach any additional branches in the parquet code.
Invalid characters are not possible: every possible value of the 16-byte
array can be represented as a UUID.
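To make the length behaviour concrete, here is a minimal standalone sketch. It is not the parquet-mr implementation: the class `UuidStringifySketch` and its helpers `stringify` and `toBytes` are hypothetical names used only for illustration, and the sketch simply assumes a stringifier that reads the first 16 bytes without checking the array length.

```java
import java.util.Arrays;
import java.util.UUID;

// Hypothetical illustration only; not the code from the pull request.
final class UuidStringifySketch {

    // Reads bytes 0..15 with no length validation, as described above.
    static String stringify(byte[] bytes) {
        long msb = 0;
        long lsb = 0;
        for (int i = 0; i < 8; i++) {
            msb = (msb << 8) | (bytes[i] & 0xffL);
        }
        for (int i = 8; i < 16; i++) {
            lsb = (lsb << 8) | (bytes[i] & 0xffL);
        }
        return new UUID(msb, lsb).toString();
    }

    // Convenience helper so the values can be written as int literals.
    static byte[] toBytes(int... values) {
        byte[] bytes = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            bytes[i] = (byte) values[i];
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] exact = toBytes(0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
            0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff);

        // Exactly 16 bytes: the usual case.
        System.out.println(stringify(exact));

        // More than 16 bytes: the trailing bytes are silently ignored.
        System.out.println(stringify(Arrays.copyOf(exact, 20)));

        // Fewer than 16 bytes: the unchecked array access fails.
        try {
            stringify(Arrays.copyOf(exact, 15));
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("too short: " + e.getClass().getSimpleName());
        }
    }
}
```

In this sketch every 16-byte input yields some UUID string, so there is no "invalid character" case to exercise. The only input-dependent failures come from the array length.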
> UUID type currently not supported by parquet-mr
> -----------------------------------------------
>
> Key: PARQUET-1827
> URL: https://issues.apache.org/jira/browse/PARQUET-1827
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Affects Versions: 1.11.0
> Reporter: Brad Smith
> Assignee: Gabor Szadovszky
> Priority: Major
> Labels: pull-request-available
>
> The parquet-format project introduced a new UUID logical type in version 2.4:
> [https://github.com/apache/parquet-format/blob/master/CHANGES.md]
> This would be a useful type to have available in some circumstances, but it
> currently isn't supported in the parquet-mr library. Hopefully this feature
> can be implemented at some point.