[ https://issues.apache.org/jira/browse/BEAM-5807?focusedWorklogId=157183&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-157183 ]
ASF GitHub Bot logged work on BEAM-5807:
----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Oct/18 20:42
Start Date: 22/Oct/18 20:42
Worklog Time Spent: 10m
Work Description: kanterov commented on a change in pull request #6777:
[BEAM-5807] Conversion from AVRO records to rows
URL: https://github.com/apache/beam/pull/6777#discussion_r227107978
##########
File path: sdks/java/core/src/test/java/org/apache/beam/sdk/util/AvroUtilsTest.java
##########
@@ -0,0 +1,114 @@
+package org.apache.beam.sdk.util;
+
+import static org.hamcrest.Matchers.not;
+import static org.junit.Assume.assumeThat;
+
+import com.google.common.collect.Lists;
+import com.pholser.junit.quickcheck.From;
+import com.pholser.junit.quickcheck.Property;
+import com.pholser.junit.quickcheck.runner.JUnitQuickcheck;
+import java.util.List;
+import java.util.function.Function;
+import org.apache.avro.RandomData;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.utils.AvroUtils;
+import org.apache.beam.sdk.util.AvroGenerators.RecordSchemaGenerator;
+import org.hamcrest.BaseMatcher;
+import org.hamcrest.Description;
+import org.junit.runner.RunWith;
+
+/** Tests for conversion between AVRO records and Beam rows. */
+@RunWith(JUnitQuickcheck.class)
+public class AvroUtilsTest {
+
+ @Property
+ @SuppressWarnings("unchecked")
+ public void supportsAnyAvroSchema(
+ @From(RecordSchemaGenerator.class) org.apache.avro.Schema avroSchema) {
+ // not everything is possible to translate
+ assumeThat(avroSchema, not(containsField(AvroUtilsTest::hasArrayOfNullable)));
+ assumeThat(avroSchema, not(containsField(AvroUtilsTest::hasNonNullUnion)));
+
+ Schema schema = AvroUtils.toSchema(avroSchema);
+ Iterable iterable = new RandomData(avroSchema, 10);
+ List<GenericRecord> records = Lists.newArrayList((Iterable<GenericRecord>) iterable);
+
+ for (GenericRecord record : records) {
+ AvroUtils.toRowStrict(record, schema);
Review comment:
I'm thinking of adding assertions once `Row => GenericData` is ready.
The assertion would be `fromRow(toRow(x)) == x`,
as well as a few unit tests in a different PR.
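
A minimal sketch of what such a round-trip assertion could look like, using plain Java collections instead of Beam or AVRO types. Everything here is hypothetical and for illustration only: `RoundTripSketch`, `FIELDS`, `randomRecord`, and the toy `toRow`/`fromRow` are not the `AvroUtils` API, and the real conversion would operate on `GenericRecord` and `Row`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Toy round-trip property; none of these names are Beam or AVRO APIs. */
final class RoundTripSketch {

  // Hypothetical schema: an ordered list of field names.
  static final List<String> FIELDS = Arrays.asList("id", "name", "score");

  // Hypothetical stand-in for record-to-row: field values in schema order.
  static List<Object> toRow(Map<String, Object> record, List<String> fields) {
    List<Object> row = new ArrayList<>();
    for (String field : fields) {
      row.add(record.get(field));
    }
    return row;
  }

  // Hypothetical inverse: rebuild the record from positional values.
  static Map<String, Object> fromRow(List<Object> row, List<String> fields) {
    Map<String, Object> record = new LinkedHashMap<>();
    for (int i = 0; i < fields.size(); i++) {
      record.put(fields.get(i), row.get(i));
    }
    return record;
  }

  // The property itself: fromRow(toRow(x)) equals x.
  static boolean roundTrips(Map<String, Object> record, List<String> fields) {
    return fromRow(toRow(record, fields), fields).equals(record);
  }

  // Generates a random record; "score" is nullable to exercise null handling.
  static Map<String, Object> randomRecord(Random random) {
    Map<String, Object> record = new LinkedHashMap<>();
    record.put("id", random.nextLong());
    record.put("name", "name-" + random.nextInt(1000));
    Double score = random.nextBoolean() ? Double.valueOf(random.nextDouble()) : null;
    record.put("score", score);
    return record;
  }

  public static void main(String[] args) {
    Random random = new Random(42);
    for (int i = 0; i < 10; i++) {
      Map<String, Object> record = randomRecord(random);
      if (!roundTrips(record, FIELDS)) {
        throw new AssertionError("round trip failed for " + record);
      }
    }
    System.out.println("all records round-tripped");
  }
}
```

In the property test above, the same check would presumably replace the bare `toRowStrict` call inside the loop, so that each generated record is compared against its converted-and-restored copy.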
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 157183)
Time Spent: 1.5h (was: 1h 20m)
> Add AvroIO.readRows
> -------------------
>
> Key: BEAM-5807
> URL: https://issues.apache.org/jira/browse/BEAM-5807
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql
> Reporter: Gleb Kanterov
> Assignee: Gleb Kanterov
> Priority: Major
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> # Motivation
> At the moment, the only way to read AVRO is through code generation with
> avro-compiler and JavaBeanSchema. This makes it impossible to write
> transforms that work with dynamic schemas. AVRO has a generic data type
> called GenericRecord, and reading it is implemented in
> AvroIO.readGenericRecords. There is code to convert GenericRecord to Row,
> shipped as part of BigQueryIO. However, it doesn't support all types or
> nested records.