Yes, it makes sense to add validations with descriptive messages. Please open
a ticket and send a PR for this.
Thanks,
Balaji.V

On Monday, September 16, 2019, 01:11:12 AM PDT, Pratyaksh Sharma
<[email protected]> wrote:

Hi Balaji,

I get your point. However, I feel that in such cases, instead of throwing a
NullPointerException, we should handle the case gracefully. The exception
should be thrown with a proper user-facing message. Please let me know your
thoughts on this.
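
A minimal sketch of such a check, for illustration only. The class
SchemaProviderCheck, the helper requireSchemaProvider and the message wording
are hypothetical and are not existing Hudi code. The idea is simply to fail
fast with a readable message before the null reference is dereferenced:

```java
public class SchemaProviderCheck {

    // Hypothetical helper (not part of Hudi): rejects a missing schema provider
    // with a descriptive message instead of letting a NullPointerException
    // surface later in the pipeline.
    public static <T> T requireSchemaProvider(T schemaProvider, String sourceClassName) {
        if (schemaProvider == null) {
            throw new IllegalArgumentException(
                "No schema provider was configured for source " + sourceClassName
                + ". Avro-based sources need one; please pass a schema provider class.");
        }
        return schemaProvider;
    }

    public static void main(String[] args) {
        try {
            // Simulates running with an Avro source and no schema provider.
            requireSchemaProvider(null, "com.uber.hoodie.utilities.sources.AvroKafkaSource");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Where exactly such a check would live (for example, before the source adapter
is used) is left open here.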

On Fri, Sep 13, 2019 at 7:26 PM Balaji Varadarajan
<[email protected]> wrote:

> Hi Pratyaksh,
> This is expected. You need to pass a schema-provider since you are using
> Avro sources. For row-based sources, DeltaStreamer can deduce the schema
> from the Row type information available from the Spark Dataset.
> Balaji.V
>
> On Friday, September 13, 2019, 02:57:37 AM PDT, Pratyaksh Sharma
> <[email protected]> wrote:
>
> Hi,
>
> I am trying to build a CDC pipeline using Hudi, working from the tag
> hoodie-0.4.7. Here is the command I used to run DeltaStreamer:
>
> spark-submit --files jaas.conf \
>   --conf 'spark.driver.extraJavaOptions=-Djava.security.auth.login.config=jaas.conf' \
>   --conf 'spark.executor.extraJavaOptions=-Djava.security.auth.login.config=jaas.conf' \
>   --master yarn --deploy-mode cluster --num-executors 2 \
>   --class com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer \
>   /path/to/hoodie-utilities-0.4.7.jar \
>   --storage-type COPY_ON_WRITE \
>   --source-class com.uber.hoodie.utilities.sources.AvroKafkaSource \
>   --source-ordering-field xxxx \
>   --target-base-path hdfs://path/to/cow_table \
>   --target-table cow_table \
>   --props hdfs://path/to/fg-kafka-source.properties \
>   --transformer-class com.uber.hoodie.utilities.transform.DebeziumTransformer \
>   --spark-master yarn-cluster --source-limit 5000
>
> Basically I have not passed any SchemaProvider class in the command. When I
> run the above command, I get the below exception in SourceFormatAdapter and
> the job gets killed -
>
> java.lang.NullPointerException
>     at com.uber.hoodie.utilities.deltastreamer.SourceFormatAdapter.fetchNewDataInRowFormat(SourceFormatAdapter.java:94)
>     at com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:224)
>     at com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:504)
>
> In the HoodieDeltaStreamer class, we try to instantiate
> RowBasedSchemaProvider before registering Avro schemas if the
> schemaProvider variable is null. Hence I am trying to understand whether
> the above exception is expected behaviour.
>
> Please help.
>
  
