vinothchandar commented on a change in pull request #623: Hudi Test Suite
URL: https://github.com/apache/incubator-hudi/pull/623#discussion_r276876666
 
 

 ##########
 File path: hoodie-bench/src/main/java/com/uber/hoodie/integrationsuite/generator/GenericRecordFullPayloadGenerator.java
 ##########
 @@ -0,0 +1,232 @@
+/*
+ *  Copyright (c) 2019 Uber Technologies, Inc. ([email protected])
+ *
+ *  Licensed under the Apache License, Version 2.0 (the "License");
+ *  you may not use this file except in compliance with the License.
+ *  You may obtain a copy of the License at
+ *
+ *           http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ *
+ */
+
+package com.uber.hoodie.integrationsuite.generator;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.uber.hoodie.common.util.collection.Pair;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import java.util.UUID;
+import org.apache.avro.Schema;
+import org.apache.avro.Schema.Type;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+/**
+ * This is a GenericRecord payload generator that generates full generic records {@link GenericRecord}.
+ * Every field of a generic record created using this generator contains a random value.
+ */
+public class GenericRecordFullPayloadGenerator implements Serializable {
+
+  public static final int DEFAULT_PAYLOAD_SIZE = 1024 * 1024; // 1 MB
+  private static final int DEFAULT_ENTRIES_FOR_COLLECTIONS = 10;
+  private static Logger log = LogManager.getLogger(GenericRecordFullPayloadGenerator.class);
+  protected final Random random = new Random();
+  // The source schema used to generate a payload
+  private final transient Schema baseSchema;
+  // Used to validate a generic record
+  private final transient GenericData genericData = new GenericData();
+  // Number of more bytes to add based on the estimated full record payload size and min payload size
+  private int numberOfBytesToAdd;
 
 Review comment:
   So the idea is that even if the schema is complex and large, we fill each
record up to the specified number of bytes? If so, do we need this kind of
complexity? Can't we just define a bunch of thin and fat records centrally,
outside this class, and always generate a value for every field in the schema?
(If testing nulls is the reason, we can support a percentage of nulls as a
config.)
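
A rough sketch of what that alternative could look like, to make the suggestion concrete. Everything here is hypothetical: FixedShapeRecordGenerator, Field, FieldType, THIN_SCHEMA, FAT_SCHEMA and nullFraction are illustrative names that do not exist in the PR, and the plain-Java field list is only a stand-in for a real Avro Schema so the example runs without dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.UUID;

// Hypothetical sketch: none of these names come from the PR, and the
// Field/FieldType types are simplified stand-ins for an Avro Schema.
final class FixedShapeRecordGenerator {

  enum FieldType { INT, LONG, DOUBLE, BOOLEAN, STRING }

  static final class Field {
    final String name;
    final FieldType type;

    Field(String name, FieldType type) {
      this.name = name;
      this.type = type;
    }
  }

  // Centrally defined shapes: a thin record with a few columns ...
  static final List<Field> THIN_SCHEMA = Collections.unmodifiableList(Arrays.asList(
      new Field("key", FieldType.STRING),
      new Field("ts", FieldType.LONG),
      new Field("value", FieldType.DOUBLE)));

  // ... and a fat record that adds many more columns to the thin one.
  static final List<Field> FAT_SCHEMA = buildFatSchema(50);

  private static List<Field> buildFatSchema(int extraColumns) {
    List<Field> fields = new ArrayList<>(THIN_SCHEMA);
    for (int i = 0; i < extraColumns; i++) {
      fields.add(new Field("col_" + i, FieldType.STRING));
    }
    return Collections.unmodifiableList(fields);
  }

  private final List<Field> schema;
  private final double nullFraction; // assumed config: share of fields set to null
  private final Random random;

  FixedShapeRecordGenerator(List<Field> schema, double nullFraction, long seed) {
    if (nullFraction < 0.0 || nullFraction > 1.0) {
      throw new IllegalArgumentException("nullFraction must be in [0, 1]");
    }
    this.schema = schema;
    this.nullFraction = nullFraction;
    this.random = new Random(seed);
  }

  // Every field in the schema gets an entry: either a random value or,
  // with probability nullFraction, a null. No byte-size padding is involved.
  Map<String, Object> generate() {
    Map<String, Object> record = new LinkedHashMap<>();
    for (Field field : schema) {
      boolean makeNull = random.nextDouble() < nullFraction;
      record.put(field.name, makeNull ? null : randomValue(field.type));
    }
    return record;
  }

  private Object randomValue(FieldType type) {
    switch (type) {
      case INT:
        return random.nextInt();
      case LONG:
        return random.nextLong();
      case DOUBLE:
        return random.nextDouble();
      case BOOLEAN:
        return random.nextBoolean();
      case STRING:
        // Built from the seeded Random so runs are reproducible.
        return new UUID(random.nextLong(), random.nextLong()).toString();
      default:
        throw new IllegalStateException("Unsupported type: " + type);
    }
  }
}
```

With this shape, record size is controlled by which predefined schema a test picks (thin or fat) rather than by tracking how many bytes are left to add, and null coverage becomes a single config value.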

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
