[ https://issues.apache.org/jira/browse/AVRO-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654471#comment-17654471 ]
ASF subversion and git services commented on AVRO-3611:
-------------------------------------------------------
Commit bf8cde0f661c2d11a770c4c3343ad19f21a5388c in avro's branch
refs/heads/master from Christophe Le Saec
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=bf8cde0f6 ]
AVRO-3611: add constants
> org.apache.avro.util.RandomData generates invalid test data
> -----------------------------------------------------------
>
> Key: AVRO-3611
> URL: https://issues.apache.org/jira/browse/AVRO-3611
> Project: Apache Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.11.1
> Reporter: Simon Klakegg
> Assignee: Christophe Le Saec
> Priority: Minor
> Labels: features, pull-request-available
> Fix For: 1.11.2
>
> Attachments: image-2022-08-18-19-05-37-323.png
>
> Original Estimate: 48h
> Time Spent: 1h 10m
> Remaining Estimate: 46h 50m
>
> When RandomData.java generates data, it does not check for logical types,
> which are described in the [Apache Avro specification|https://avro.apache.org/docs/1.11.1/specification/_print/].
> *For instance, the generate method returns this for INT fields:*
> {code:java}
> case INT: return random.nextInt(); {code}
>
> *However, an int field could be of logical type date:*
> !image-2022-08-18-19-05-37-323.png|width=1052,height=266!
>
> In many cases this produces an int that is out of range for the logical type
> date, which breaks record creation in, for instance, Kafka.
> My suggestion is to generate data that is valid for logical types. Here is an
> example I made for int and long:
> {code:java}
> case INT:
>   switch (logicalTypeName) {
>   case "date":
>     // Random number of days between Unix Epoch start day (0) and end day (24855)
>     int maxDaysInEpoch = (int) Duration.ofSeconds(Integer.MAX_VALUE).toDays();
>     return ThreadLocalRandom.current().nextInt(0, maxDaysInEpoch);
>   case "time-millis":
>     // Random number of milliseconds between midnight 00:00:00.000 (0) and 23:59:59.999 (86399999)
>     int maxMillisecondsInDay = (int) Duration.ofDays(1).toMillis() - 1;
>     // Note: nextInt(origin, bound) on java.util.Random needs Java 17+; the bound is exclusive
>     return random.nextInt(0, maxMillisecondsInDay);
>   default: return random.nextInt();
>   }
> case LONG:
>   switch (logicalTypeName) {
>   case "time-micros":
>     // Random number of microseconds between midnight 00:00:00.000000 (0) and 23:59:59.999999 (86399999999)
>     long maxMicrosecondsInDay = (Duration.ofDays(1).toNanos() - 1) / 1000;
>     // Note: nextLong(origin, bound) on java.util.Random needs Java 17+; the bound is exclusive
>     return random.nextLong(0, maxMicrosecondsInDay);
>   case "timestamp-millis":
>     // Random milliseconds between Unix Epoch (0) start and end (2147483647000)
>     long maxMillisecondsInEpoch = TimeUnit.SECONDS.toMillis(Integer.MAX_VALUE);
>     return ThreadLocalRandom.current().nextLong(0, maxMillisecondsInEpoch);
>   case "timestamp-micros":
>     // Random microseconds between Unix Epoch (0) start and end (2147483647000000)
>     long maxMicrosecondsInEpoch = TimeUnit.SECONDS.toMicros(Integer.MAX_VALUE);
>     return ThreadLocalRandom.current().nextLong(0, maxMicrosecondsInEpoch);
>   case "local-timestamp-millis":
>     // Random number of milliseconds between Unix Epoch start (0) and 100 years from now (now() + 100)
>     ZonedDateTime hundredYearsFromNow = ZonedDateTime.now().plusYears(100);
>     long hundredYearsEpochMillis = ChronoUnit.MILLIS.between(Instant.EPOCH, hundredYearsFromNow);
>     return random.nextLong(0, hundredYearsEpochMillis);
>   case "local-timestamp-micros":
>     // Random number of microseconds between Unix Epoch start (0) and 100 years from now (now() + 100)
>     // Recomputed here: a local assigned under another case label is not definitely assigned in this one
>     long hundredYearsEpochMicros = ChronoUnit.MICROS.between(Instant.EPOCH,
>         ZonedDateTime.now().plusYears(100));
>     return random.nextLong(0, hundredYearsEpochMicros);
>   default: return random.nextLong();
>   } {code}
>
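
For readers who want to try the idea from the quoted description outside of Avro, the following is a minimal, self-contained sketch and not the code that was committed. The class name LogicalTypeRandomSketch, its method names, and its constants are hypothetical; the value ranges are the ones proposed in the issue. It sticks to java.util.Random methods that exist on Java 8, so a seeded Random stays reproducible, and it treats the upper bounds as inclusive.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.util.Random;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not part of Avro.
class LogicalTypeRandomSketch {
    // Last whole day reachable with Integer.MAX_VALUE seconds since the epoch (range proposed in the issue).
    static final int MAX_EPOCH_DAYS = (int) Duration.ofSeconds(Integer.MAX_VALUE).toDays();
    static final int MILLIS_PER_DAY = (int) TimeUnit.DAYS.toMillis(1);
    static final long MICROS_PER_DAY = TimeUnit.DAYS.toMicros(1);

    /** Logical type "date": days since the epoch, in [0, MAX_EPOCH_DAYS]. */
    static int randomDate(Random random) {
        return random.nextInt(MAX_EPOCH_DAYS + 1); // nextInt(bound) excludes the bound
    }

    /** Logical type "time-millis": milliseconds after midnight, in [0, MILLIS_PER_DAY). */
    static int randomTimeMillis(Random random) {
        return random.nextInt(MILLIS_PER_DAY);
    }

    /** Logical type "time-micros": microseconds after midnight, in [0, MICROS_PER_DAY). */
    static long randomTimeMicros(Random random) {
        return boundedLong(random, MICROS_PER_DAY);
    }

    /** Uniform long in [0, bound), drawn from the given Random via its Java 8 stream API. */
    static long boundedLong(Random random, long bound) {
        return random.longs(1, 0, bound).findFirst().getAsLong();
    }

    public static void main(String[] args) {
        Random random = new Random(42L); // fixed seed so runs are repeatable
        System.out.println("date        = " + LocalDate.ofEpochDay(randomDate(random)));
        System.out.println("time-millis = " + randomTimeMillis(random));
        System.out.println("time-micros = " + randomTimeMicros(random));
    }
}
```

Passing the Random in as a parameter, instead of calling ThreadLocalRandom, keeps every generated value tied to a single seed, which matters when a failing test has to be replayed.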
--
This message was sent by Atlassian Jira
(v8.20.10#820010)