[ 
https://issues.apache.org/jira/browse/HUDI-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406374#comment-17406374
 ] 

ASF GitHub Bot commented on HUDI-2320:
--------------------------------------

yanghua commented on a change in pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#discussion_r697977680



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java
##########
@@ -91,12 +96,21 @@ public AvroKafkaSource(TypedProperties props, JavaSparkContext sparkContext, Spa
       return new InputBatch<>(Option.empty(), CheckpointUtils.offsetsToStr(offsetRanges));
     }
     JavaRDD<GenericRecord> newDataRDD = toRDD(offsetRanges);
-    return new InputBatch<>(Option.of(newDataRDD), KafkaOffsetGen.CheckpointUtils.offsetsToStr(offsetRanges));
+    return new InputBatch<>(Option.of(newDataRDD), CheckpointUtils.offsetsToStr(offsetRanges));
   }
 
   private JavaRDD<GenericRecord> toRDD(OffsetRange[] offsetRanges) {
-    return KafkaUtils.createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges,
-            LocationStrategies.PreferConsistent()).map(obj -> (GenericRecord) obj.value());
+    if (deserializerClassName.equals(ByteArrayDeserializer.class.getName())) {
+      if (schemaProvider == null) {
+        throw new HoodieException("Please provide a valid schema provider class!");

Review comment:
       Please give more information in the exception message, e.g. that a
schema provider is required when `ByteArrayDeserializer` is chosen.
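
One way to act on this suggestion is to name both the configured deserializer and the missing option in the message. The following is only a minimal standalone sketch, not code from the pull request: `SchemaProviderCheck` and `requireSchemaProvider` are hypothetical names, `IllegalStateException` stands in for `HoodieException` so the example compiles without Hudi on the classpath, and the message wording is illustrative.

```java
// Hypothetical, self-contained sketch; not the code merged in the pull request.
class SchemaProviderCheck {

  // Deserializer class name as given in the issue description.
  static final String BYTE_ARRAY_DESERIALIZER =
      "org.apache.kafka.common.serialization.ByteArrayDeserializer";

  // Throws a descriptive exception when the byte-array deserializer is configured
  // but no schema provider was supplied. IllegalStateException is a stand-in for
  // HoodieException (assumption made to keep the example dependency-free).
  static void requireSchemaProvider(String deserializerClassName, Object schemaProvider) {
    if (BYTE_ARRAY_DESERIALIZER.equals(deserializerClassName) && schemaProvider == null) {
      throw new IllegalStateException(
          "A schema provider class (--schemaprovider-class) is required when the "
              + "Kafka value deserializer is " + deserializerClassName);
    }
  }

  public static void main(String[] args) {
    try {
      requireSchemaProvider(BYTE_ARRAY_DESERIALIZER, null);
    } catch (IllegalStateException e) {
      // Prints the descriptive message built above.
      System.out.println(e.getMessage());
    }
  }
}
```

A message of this shape tells the user which setting triggered the check and which option fixes it, which is the extra context the review comment asks for.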




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



> Add support ByteArrayDeserializer in AvroKafkaSource
> ----------------------------------------------------
>
>                 Key: HUDI-2320
>                 URL: https://issues.apache.org/jira/browse/HUDI-2320
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: DeltaStreamer
>            Reporter: 董可伦
>            Assignee: 董可伦
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> When the 'value.serializer' of the Kafka Avro producer is
> 'org.apache.kafka.common.serialization.ByteArraySerializer', use the following
> configuration:
> {code:java}
> --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
> --schemaprovider-class org.apache.hudi.utilities.schema.JdbcbasedSchemaProvider \
> --hoodie-conf "hoodie.deltastreamer.source.kafka.value.deserializer.class=org.apache.kafka.common.serialization.ByteArrayDeserializer"
> {code}
> Currently, this throws an exception:
> {code:java}
> java.lang.ClassCastException: [B cannot be cast to org.apache.avro.generic.GenericRecord
> {code}
> Once ByteArrayDeserializer is supported, the configuration above works
> properly, and there is no need to provide 'schema.registry.url'. For example,
> we can use the JdbcbasedSchemaProvider to get the sourceSchema.
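
The `[B` in the exception quoted above is the JVM's internal name for `byte[]`: the record value arrives as raw bytes and is then cast directly to a record type. The sketch below reproduces that failure mode in isolation. `ByteArrayCastDemo` and `GenericRecordLike` are hypothetical stand-ins (the latter for `org.apache.avro.generic.GenericRecord`) so the example runs without Avro or Hudi; the exact message text varies by JVM version.

```java
// Hypothetical, self-contained illustration of the cast failure described in the issue.
class ByteArrayCastDemo {

  // Stand-in for org.apache.avro.generic.GenericRecord (assumption, to avoid the Avro dependency).
  interface GenericRecordLike {}

  // Attempts the same kind of unchecked cast the source performed on the Kafka value.
  // Returns the exception message if the cast fails, or null if it succeeds.
  static String castFailure(Object value) {
    try {
      GenericRecordLike ignored = (GenericRecordLike) value;
      return null;
    } catch (ClassCastException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // What a byte-array deserializer hands over: raw bytes, not a decoded record.
    Object value = new byte[] {1, 2, 3};
    System.out.println(castFailure(value));
  }
}
```

Avoiding the error therefore means decoding the bytes with a known schema instead of casting them, which is why a schema provider is needed in this configuration.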



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
