[ https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Azmal Sheik updated GOBBLIN-321:
--------------------------------
Description:

I was trying to load CSV file data into HDFS with the job configuration below, but I am getting a ClassNotFoundException. I checked lib/gobblin-core.jar and the class TextFileBasedSource is present in it, yet the job still reports class not found. Can anyone help here? The job configuration and logs follow.

*JOB:*

job.name=json-gobblin-hdfs
job.group=Gobblin-Json-Demo
job.description=Publishing JSON data from files to HDFS in Avro format.
job.jars=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/
job.lock.enabled=false
distcp.persist.dir=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/
source.class=gobblin.source.extractor.filebased.TextFileBasedSource
converter.classes="gobblin.converter.StringSchemaInjector,gobblin.converter.csv.CsvToJsonConverter,gobblin.converter.avro.JsonIntermediateToAvroConverter"
writer.builder.class=gobblin.writer.AvroDataWriterBuilder
source.entity=
source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample
gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","
extract.table.name=CsvToAvro
extract.namespace=gobblin.example
extract.table.type=APPEND_ONLY

# source data schema
source.schema={"namespace":"example.avro", "type":"record", "name":"User", "fields":[{"name":"name", "type":"string"}, {"name":"favorite_number", "type":"int"}, {#"name":"favorite_color", "type":"string"}]}
gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","

# quality checker configuration properties
qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
qualitychecker.row.policies=gobblin.policies.schema.SchemaRowCheckPolicy
qualitychecker.row.policy.types=OPTIONAL

# data publisher class to be used
data.publisher.type=gobblin.publisher.BaseDataPublisher

# writer configuration properties
writer.destination.type=HDFS
writer.output.format=AVRO
fs.uri=hdfs://........:8020/
writer.fs.uri=hdfs://.......:8020/
state.store.fs.uri=hdfs://:8020/
mr.job.root.dir=/user/ndxmetadata/output/working
state.store.dir=/user/ndxmetadata/output/state-store
writer.staging.dir=/user/ndxmetadata/output/task-staging
writer.output.dir=/user/ndxmetadata/output/task-output
data.publisher.final.dir=/user/ndxmetadata/output/

---------------------------------------------------------------------------
Logs are attached below.
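Since the jar on disk visibly contains TextFileBasedSource, a ClassNotFoundException usually means the jar is not on the classpath the launcher JVM actually uses (for example, job.jars pointing at a directory that the runner does not expand into individual jars). A minimal, hypothetical diagnostic, not part of Gobblin, is to ask a JVM started with the same classpath to resolve the class by name:

```java
// Hypothetical helper (not part of Gobblin): run it with the same classpath
// as the Gobblin launcher to see whether the class actually resolves there.
public class ClassCheck {

    // True if the current classloader can resolve the given class name.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default class name taken from source.class in the job config above.
        String name = args.length > 0 ? args[0]
                : "gobblin.source.extractor.filebased.TextFileBasedSource";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

If this prints `loadable: false` under the launcher's classpath even though `jar tf gobblin-core.jar` lists the class, the jar file is simply not reaching the JVM's classpath.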
> CSV to HDFS ISSUE
> -----------------
>
> Key: GOBBLIN-321
> URL: https://issues.apache.org/jira/browse/GOBBLIN-321
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Azmal Sheik
> Assignee: Joel Baranick
> Priority: Critical
> Labels: beginner, newbie, starter
> Attachments: gobblin-current.log, job.txt
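Separately from the classpath question, one likely problem in the config above: source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample has only two slashes, so in a standard URI the "home" segment is parsed as the authority (host) rather than the first path component; file:///home/... (three slashes) is probably what was intended. A quick demonstration with plain java.net.URI (the example paths are placeholders):

```java
import java.net.URI;

public class FileUriDemo {

    // Path component a URI actually resolves to.
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    // Host (authority) component, or null if there is none.
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    public static void main(String[] args) {
        // Two slashes: "home" becomes the host, and the path loses "/home".
        System.out.println(hostOf("file://home/ndxmetadata/Ravi/Gobblin/sample"));
        System.out.println(pathOf("file://home/ndxmetadata/Ravi/Gobblin/sample"));
        // Three slashes: empty authority, full path preserved.
        System.out.println(pathOf("file:///home/ndxmetadata/Ravi/Gobblin/sample"));
    }
}
```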
> ---------------------------------------------------------------------------
> Logs are attached below.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)