[
https://issues.apache.org/jira/browse/AVRO-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204876#comment-17204876
]
Ryan Skraba commented on AVRO-2934:
-----------------------------------
Hello! There's a
[RandomData|https://github.com/apache/avro/blob/e208f4b2d442bc14aaba3dad86e8122b83a0873c/lang/java/avro/src/main/java/org/apache/avro/util/RandomData.java]
class that can be used to create pseudo-random data.
{{RandomData}} is an {{Iterable}}, so it's easy to use for creating large
collections, and the output is deterministic if you give it a seed.
{code}
// Create 5000 records that correspond to the given schema using the seed 0
for (Object datum : new RandomData(myRecordSchema, 5000, 0L)) {
  // e.g., datum will be a GenericRecord if myRecordSchema is a
  // Schema.Type.RECORD
  // ...
}
{code}
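To illustrate why a seed makes the output repeatable, here is a minimal stdlib-only sketch of the same pattern: an {{Iterable}} that yields a fixed number of values from a seeded {{java.util.Random}}. The names {{SeededInts}} and {{SeededIntsDemo}} are made up for this example and are not part of Avro; the sketch only assumes that a generator of this kind creates its random source from the seed each time it is iterated.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

// Hypothetical stand-in for illustration only; this is not Avro's RandomData.
final class SeededInts implements Iterable<Integer> {
  private final int count;
  private final long seed;

  SeededInts(int count, long seed) {
    this.count = count;
    this.seed = seed;
  }

  @Override
  public Iterator<Integer> iterator() {
    // A fresh Random per iterator, so every pass replays the same sequence.
    final Random random = new Random(seed);
    return new Iterator<Integer>() {
      private int produced = 0;

      @Override
      public boolean hasNext() {
        return produced < count;
      }

      @Override
      public Integer next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        produced++;
        return random.nextInt();
      }
    };
  }

  // Drains an Iterable into a list so two runs can be compared.
  static List<Integer> collect(Iterable<Integer> source) {
    List<Integer> out = new ArrayList<>();
    for (Integer value : source) {
      out.add(value);
    }
    return out;
  }
}

class SeededIntsDemo {
  public static void main(String[] args) {
    List<Integer> first = SeededInts.collect(new SeededInts(5, 0L));
    List<Integer> second = SeededInts.collect(new SeededInts(5, 0L));
    System.out.println(first.equals(second)); // same seed, same data
  }
}
```

Because the seed is stored and the {{Random}} is rebuilt per iteration, tests that depend on the generated values stay stable from run to run.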
The rules for generating the data are hard-coded in the generating class; it's
_OK_ but inflexible. If you have any proposals for improving the
generating functions via annotations, that could be an interesting improvement!
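To make the annotation idea a bit more concrete, here is a rough sketch of what a reflection-based filler with an opt-out annotation and an overwrite switch might look like. Nothing in it exists in Avro: {{SkipFill}}, {{RandomFiller}}, {{Person}} and {{Address}} are hypothetical names, it works on plain Java fields rather than on an Avro schema, and it assumes nested types are application classes with a no-arg constructor.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Random;

/** Hypothetical marker: fields carrying it are never touched by the filler. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SkipFill {}

/** Hypothetical helper (not an Avro class) that fills an object graph via reflection. */
final class RandomFiller {
  private static final int MAX_DEPTH = 8; // guards against self-referencing types
  private final Random random;
  private final boolean overwrite;

  RandomFiller(long seed, boolean overwrite) {
    this.random = new Random(seed);
    this.overwrite = overwrite;
  }

  <T> T fill(T target) {
    try {
      fill(target, 0);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot fill " + target.getClass(), e);
    }
    return target;
  }

  private void fill(Object target, int depth) throws ReflectiveOperationException {
    for (Field field : target.getClass().getDeclaredFields()) {
      int mods = field.getModifiers();
      if (Modifier.isStatic(mods) || Modifier.isFinal(mods) || field.isSynthetic()
          || field.isAnnotationPresent(SkipFill.class)) {
        continue;
      }
      field.setAccessible(true);
      Class<?> type = field.getType();
      Object current = field.get(target);
      if (type == int.class) {
        field.setInt(target, random.nextInt(1000)); // primitives are always set
      } else if (type == long.class) {
        field.setLong(target, random.nextLong());
      } else if (type == boolean.class) {
        field.setBoolean(target, random.nextBoolean());
      } else if (type == double.class) {
        field.setDouble(target, random.nextDouble());
      } else if (type == String.class) {
        if (current == null || overwrite) {
          field.set(target, "str-" + random.nextInt(1000));
        }
      } else if (isNestedBean(type) && depth < MAX_DEPTH) {
        if (current == null) {
          Constructor<?> ctor = type.getDeclaredConstructor();
          ctor.setAccessible(true);
          current = ctor.newInstance();
          field.set(target, current);
        }
        fill(current, depth + 1);
      }
    }
  }

  // Only recurse into application classes; JDK types (null class loader) are left as-is.
  private static boolean isNestedBean(Class<?> type) {
    return !type.isPrimitive() && !type.isArray() && !type.isEnum() && !type.isInterface()
        && !Modifier.isAbstract(type.getModifiers()) && type.getClassLoader() != null;
  }
}

class Address {
  String city;
  int zip;
}

class Person {
  String name = "preset"; // kept when overwrite is false
  @SkipFill String note;  // always left null
  Address address;        // created and filled recursively
}

class RandomFillerDemo {
  public static void main(String[] args) {
    Person p = new RandomFiller(0L, false).fill(new Person());
    System.out.println(p.name + " / " + p.note + " / " + p.address.city);
  }
}
```

A real proposal would more likely drive this from the schema (field types, unions, logical types) than from Java fields, but the options are the same: a seed, an overwrite flag, and a way to exclude members.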
> Initialise all fields in a nested schema
> ----------------------------------------
>
> Key: AVRO-2934
> URL: https://issues.apache.org/jira/browse/AVRO-2934
> Project: Apache Avro
> Issue Type: Improvement
> Components: java
> Reporter: Biliuta
> Priority: Minor
>
> For testing purposes it would be nice to have a way to initialise all fields
> to some value even if there is no default value specified in the schema (the
> value is required). I noticed that for schemas that are large and have a few
> levels of nesting it can get quite ugly (creating all the required sub
> classes) when you want to instantiate a random message to do some tests.
> The possible data types in an Avro schema are initialisable to some
> default/random value and if this is not the value desired, it can be changed
> at any time.
> I did a short implementation using reflection that recursively goes through
> all the fields of a message but maybe an annotation included in the Avro
> schema (using javaAnnotation) would make more sense so that it is available
> only if needed. The annotation could also include some options like default
> or random value, overwrite existing non-null members or not, ignore specific
> members or types.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)