harishraju-govindaraju edited a comment on issue #4641:
URL: https://github.com/apache/hudi/issues/4641#issuecomment-1017122913
Hello @nsivabalan ,
Thanks for promptly responding to my question.
I cleared the folder and reran the spark-submit command below. The
.hoodie folder was created, but the job ended with an error and wrote no data files.
**_Unrecognized token 'Objavro': was expecting (JSON String, Number, Array,
Object or token 'null', 'true' or 'false')
at [Source:
(String)"Objavro.schema�{"type":"record","name":"topLevelRecord","fields":[{"name":"id","type":["string","null"]},{"name":"creation_date","type":["string","null"]},{"name":"last_update_time","type":["string","null"]},{"name":"quantity","type":["string","null"]},{"name":"compcode","type":["string","null"]}]}0org.apache.spark.version";
line: 1, column: 11]_**
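The quoted text in the error starts with `Objavro.schema`, which matches the start of an Avro object container file (the magic bytes `Obj` plus a `0x01` byte, followed by a metadata map whose first key is `avro.schema`). One possible reading is that the files under the source root are Avro files being handed to a JSON parser. A minimal sketch for checking that locally, assuming you can download the first few bytes of one input file; `sniff_format` is a hypothetical helper, not part of Hudi:

```python
# Magic bytes that open every Avro object container file (per the Avro spec).
AVRO_MAGIC = b"Obj\x01"


def sniff_format(head: bytes) -> str:
    """Guess the format of a file from its leading bytes.

    Hypothetical helper for illustration only; it is not a Hudi API.
    Returns "avro", "json", or "unknown".
    """
    if head[:4] == AVRO_MAGIC:
        return "avro"
    # JSON documents start with an object or array after optional whitespace.
    if head.lstrip()[:1] in (b"{", b"["):
        return "json"
    return "unknown"


if __name__ == "__main__":
    # Leading bytes shaped like the payload quoted in the error message.
    print(sniff_format(b"Obj\x01\x08\x16avro.schema"))
    print(sniff_format(b'{"id": "1", "compcode": "A"}'))
```

If the input does turn out to be Avro, it may be worth checking which source class the job uses. My understanding is that DeltaStreamer reads DFS input as JSON unless a `--source-class` is passed, but treat that as an assumption and confirm it against the documentation for your Hudi version.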
spark-submit \
--jars "s3://zcustomjar/spark-avro_2.11-2.4.4.jar" \
--deploy-mode "client" \
--class "org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer" \
/usr/lib/hudi/hudi-utilities-bundle.jar \
--schemaprovider-class "org.apache.hudi.utilities.schema.FilebasedSchemaProvider" \
--table-type COPY_ON_WRITE \
--source-ordering-field id \
--target-base-path s3://ztrusted1/default/hudi-table1/ \
--target-table hudi-table1 \
--hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator \
--hoodie-conf hoodie.datasource.write.recordkey.field=id \
--hoodie-conf hoodie.deltastreamer.source.dfs.root=s3://zlanding1/input1/ \
--hoodie-conf hoodie.datasource.write.partitionpath.field=compcode \
--hoodie-conf hoodie.datasource.write.operation=insert \
--hoodie-conf hoodie.deltastreamer.schemaprovider.source.schema.file=s3://zcustomjar/source2.avsc \
--hoodie-conf hoodie.deltastreamer.schemaprovider.target.schema.file=s3://zcustomjar/target.avsc
I created the .avsc schema file manually in Notepad. I am not sure whether
that is a problem.
{
  "type" : "record",
  "name" : "triprec",
  "fields" : [
    {
      "name" : "id",
      "type" : "string"
    }, {
      "name" : "creation_date",
      "type" : "string"
    }, {
      "name" : "last_update_time",
      "type" : "string"
    }, {
      "name" : "quantity",
      "type" : "string"
    }, {
      "name" : "compcode",
      "type" : "string"
    }]
}
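To rule out the hand-edited file itself, a quick local check can help. The sketch below is an illustration only, not a Hudi tool: `check_avsc` and `WRITER_FIELDS` are hypothetical names, and the field list is copied from the writer schema shown in the error message. It flags a UTF-8 byte order mark (which some Notepad versions add when saving as UTF-8), invalid JSON, and field-name mismatches. Whether any of these would actually break the schema provider is an assumption, not something confirmed here.

```python
import json

# Field names copied from the writer schema embedded in the error message.
WRITER_FIELDS = ["id", "creation_date", "last_update_time", "quantity", "compcode"]

UTF8_BOM = b"\xef\xbb\xbf"


def check_avsc(raw: bytes, expected_fields=WRITER_FIELDS) -> list:
    """Return a list of human-readable problems found in an .avsc payload.

    Hypothetical helper for illustration; an empty list means no problem
    was detected by these particular checks.
    """
    problems = []
    if raw.startswith(UTF8_BOM):
        problems.append("file starts with a UTF-8 byte order mark")
        raw = raw[len(UTF8_BOM):]
    try:
        schema = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        problems.append(f"not valid UTF-8 JSON: {exc}")
        return problems
    if not isinstance(schema, dict) or schema.get("type") != "record":
        problems.append("top-level schema is not a record")
        return problems
    names = [field.get("name") for field in schema.get("fields", [])]
    missing = [n for n in expected_fields if n not in names]
    extra = [n for n in names if n not in expected_fields]
    if missing:
        problems.append(f"missing fields: {missing}")
    if extra:
        problems.append(f"unexpected fields: {extra}")
    return problems


if __name__ == "__main__":
    # Read a local copy of the schema file; the path is a placeholder.
    with open("source2.avsc", "rb") as handle:
        for problem in check_avsc(handle.read()):
            print(problem)
```

Note that this only compares field names. The writer schema in the error declares each field as `["string","null"]`, while the schema above declares plain `"string"`; the sketch does not judge whether that difference matters.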