[jira] [Updated] (GOBBLIN-321) CSV to HDFS ISSUE

2018-01-02 Thread Abhishek Tiwari (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Tiwari updated GOBBLIN-321:

Fix Version/s: 0.12

> CSV to HDFS ISSUE
> -
>
> Key: GOBBLIN-321
> URL: https://issues.apache.org/jira/browse/GOBBLIN-321
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Azmal Sheik
> Assignee: Joel Baranick
> Priority: Critical
> Labels: beginner, newbie, starter
> Fix For: 0.12
>
> Attachments: gobblin-current.log, job.txt
>
>
> I was trying to load CSV file data into HDFS with the job configuration below, but I'm getting a
> class-not-found error. I checked lib/gobblin-core.jar and the class
> TextFileBasedSource is present, yet the job still reports that the class cannot be found.
> Can anyone help?
> The job configuration and logs follow.
> *JOB:*
> job.name=json-gobblin-hdfs
> job.group=Gobblin-Json-Demo
> job.description=Publishing JSON data from files to HDFS in Avro format.
> job.jars=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/
> job.lock.enabled=false
> distcp.persist.dir=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/
> source.class=gobblin.source.extractor.filebased.TextFileBasedSource
> converter.classes="gobblin.converter.StringSchemaInjector,gobblin.converter.csv.CsvToJsonConverter,gobblin.converter.avro.JsonIntermediateToAvroConverter"
> writer.builder.class=gobblin.writer.AvroDataWriterBuilder
> source.entity=
> source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample
> gobblin.converter.schemaInjector.schema=SCHEMA
> converter.csv.to.json.delimiter=","
> extract.table.name=CsvToAvro
> extract.namespace=gobblin.example
> extract.table.type=APPEND_ONLY
> source.schema={"namespace":"example.avro", "type":"record", "name":"User", 
> "fields":[{"name":"name", "type":"string"}, {"name":"favorite_number",  
> "type":"int"}, {#"name":"favorite_color", "type":"string"}]}
> gobblin.converter.schemaInjector.schema=SCHEMA
> converter.csv.to.json.delimiter=","
> qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
> qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
> qualitychecker.row.policies=gobblin.policies.schema.SchemaRowCheckPolicy
> qualitychecker.row.policy.types=OPTIONAL
> data.publisher.type=gobblin.publisher.BaseDataPublisher
> writer.destination.type=HDFS
> writer.output.format=AVRO
> fs.uri=hdfs://:8020/
> writer.fs.uri=hdfs://...:8020/
> state.store.fs.uri=hdfs://:8020/
> mr.job.root.dir=/user/ndxmetadata/output/working
> state.store.dir=/user/ndxmetadata/output/state-store
> writer.staging.dir=/user/ndxmetadata/output/task-staging
> writer.output.dir=/user/ndxmetadata/output/task-output
> data.publisher.final.dir=/user/ndxmetadata/output/
> ---
> Logs attached below
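
One way to double-check the jar observation in the report is to scan every jar under the `job.jars` directory for the class entry, rather than inspecting a single jar by hand. The sketch below is illustrative only: `find_class_in_jars` is a hypothetical helper, not part of Gobblin, and it only confirms that a `.class` entry exists in some jar. It does not show which classpath the running job actually uses.

```python
import zipfile
from pathlib import Path


def find_class_in_jars(lib_dir, class_name):
    """Return the names of the jars under lib_dir that contain class_name.

    class_name is a fully qualified Java class name, for example
    gobblin.source.extractor.filebased.TextFileBasedSource.
    """
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    matches.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives.
            continue
    return matches


if __name__ == "__main__":
    # Directory and class name are copied from the job configuration in the report.
    lib_dir = "/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/"
    cls = "gobblin.source.extractor.filebased.TextFileBasedSource"
    print(find_class_in_jars(lib_dir, cls))
```

An empty list would suggest the class is not packaged under that exact package name in any jar in the directory. A non-empty list would point the investigation toward how the job's classpath is assembled instead.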



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (GOBBLIN-321) CSV to HDFS ISSUE

2017-11-22 Thread Azmal Sheik (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Azmal Sheik updated GOBBLIN-321:

Description: 
I was trying to load CSV file data into HDFS with the job configuration below, but I'm getting a
class-not-found error. I checked lib/gobblin-core.jar and the class
TextFileBasedSource is present, yet the job still reports that the class cannot be found.

Can anyone help?

The job configuration and logs follow.


*JOB:*


job.name=json-gobblin-hdfs
job.group=Gobblin-Json-Demo
job.description=Publishing JSON data from files to HDFS in Avro format.


job.jars=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/
job.lock.enabled=false
distcp.persist.dir=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/



source.class=gobblin.source.extractor.filebased.TextFileBasedSource
converter.classes="gobblin.converter.StringSchemaInjector,gobblin.converter.csv.CsvToJsonConverter,gobblin.converter.avro.JsonIntermediateToAvroConverter"
writer.builder.class=gobblin.writer.AvroDataWriterBuilder
source.entity=
source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample
gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","


extract.table.name=CsvToAvro
extract.namespace=gobblin.example
extract.table.type=APPEND_ONLY

source.schema={"namespace":"example.avro", "type":"record", "name":"User", 
"fields":[{"name":"name", "type":"string"}, {"name":"favorite_number",  
"type":"int"}, {#"name":"favorite_color", "type":"string"}]}




gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","

qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
qualitychecker.row.policies=gobblin.policies.schema.SchemaRowCheckPolicy
qualitychecker.row.policy.types=OPTIONAL
data.publisher.type=gobblin.publisher.BaseDataPublisher

writer.destination.type=HDFS
writer.output.format=AVRO


fs.uri=hdfs://:8020/
writer.fs.uri=hdfs://...:8020/
state.store.fs.uri=hdfs://:8020/


mr.job.root.dir=/user/ndxmetadata/output/working
state.store.dir=/user/ndxmetadata/output/state-store
writer.staging.dir=/user/ndxmetadata/output/task-staging
writer.output.dir=/user/ndxmetadata/output/task-output
data.publisher.final.dir=/user/ndxmetadata/output/


---


Logs attached below
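
Separately from the class-loading error, the `source.filebased.data.directory` value in the job above uses `file://` followed by `home`. The sketch below uses only Python's standard library to show how generic URI parsing reads that value. `split_file_uri` is a hypothetical helper, and whether Gobblin or Hadoop interprets the value the same way is an assumption, not something confirmed in this thread.

```python
from urllib.parse import urlsplit


def split_file_uri(uri):
    """Return (authority, path) as produced by generic URI parsing."""
    parts = urlsplit(uri)
    return parts.netloc, parts.path


if __name__ == "__main__":
    # Value copied from source.filebased.data.directory in the job above.
    print(split_file_uri("file://home/ndxmetadata/Ravi/Gobblin/sample"))
    # Hypothetical three-slash variant, shown only for comparison.
    print(split_file_uri("file:///home/ndxmetadata/Ravi/Gobblin/sample"))
```

With two slashes, `home` is parsed as the authority and the path starts at `/ndxmetadata`. With three slashes the authority is empty and the path starts at `/home`.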

> CSV to HDFS ISSUE
> -
>
> Key: 

[jira] [Updated] (GOBBLIN-321) CSV to HDFS ISSUE

2017-11-22 Thread Azmal Sheik (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Azmal Sheik updated GOBBLIN-321:

Description: 
I was trying to load CSV file data into HDFS with the job configuration below, but I'm getting a
class-not-found error. I checked lib/gobblin-core.jar and the class
TextFileBasedSource is present, yet the job still reports that the class cannot be found.

Can anyone help?

The job configuration and logs follow.


*JOB:*


job.name=json-gobblin-hdfs
job.group=Gobblin-Json-Demo
job.description=Publishing JSON data from files to HDFS in Avro format.


job.jars=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/
job.lock.enabled=false
distcp.persist.dir=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/



source.class=gobblin.source.extractor.filebased.TextFileBasedSource
converter.classes="gobblin.converter.StringSchemaInjector,gobblin.converter.csv.CsvToJsonConverter,gobblin.converter.avro.JsonIntermediateToAvroConverter"
writer.builder.class=gobblin.writer.AvroDataWriterBuilder
source.entity=
source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample
gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","


extract.table.name=CsvToAvro
extract.namespace=gobblin.example
extract.table.type=APPEND_ONLY



# source data schema
source.schema={"namespace":"example.avro", "type":"record", "name":"User", 
"fields":[{"name":"name", "type":"string"}, {"name":"favorite_number",  
"type":"int"}, {#"name":"favorite_color", "type":"string"}]}




gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","

# quality checker configuration properties
qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
qualitychecker.row.policies=gobblin.policies.schema.SchemaRowCheckPolicy
qualitychecker.row.policy.types=OPTIONAL
# data publisher class to be used
data.publisher.type=gobblin.publisher.BaseDataPublisher

# writer configuration properties
writer.destination.type=HDFS
writer.output.format=AVRO


fs.uri=hdfs://:8020/
writer.fs.uri=hdfs://...:8020/
state.store.fs.uri=hdfs://:8020/


mr.job.root.dir=/user/ndxmetadata/output/working
state.store.dir=/user/ndxmetadata/output/state-store
writer.staging.dir=/user/ndxmetadata/output/task-staging
writer.output.dir=/user/ndxmetadata/output/task-output
data.publisher.final.dir=/user/ndxmetadata/output/


---


Logs attached below


[jira] [Updated] (GOBBLIN-321) CSV to HDFS ISSUE

2017-11-22 Thread Azmal Sheik (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Azmal Sheik updated GOBBLIN-321:

Description: 
I was trying to load CSV file data into HDFS with the job configuration below, but I'm getting a
class-not-found error. I checked lib/gobblin-core.jar and the class
TextFileBasedSource is present, yet the job still reports that the class cannot be found.

Can anyone help?

The job configuration and logs follow.


*JOB:*


job.name=json-gobblin-hdfs
job.group=Gobblin-Json-Demo
job.description=Publishing JSON data from files to HDFS in Avro format.


job.jars=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/
job.lock.enabled=false
distcp.persist.dir=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/



source.class=gobblin.source.extractor.filebased.TextFileBasedSource
converter.classes="gobblin.converter.StringSchemaInjector,gobblin.converter.csv.CsvToJsonConverter,gobblin.converter.avro.JsonIntermediateToAvroConverter"
writer.builder.class=gobblin.writer.AvroDataWriterBuilder

source.entity=
source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample
gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","


extract.table.name=CsvToAvro
extract.namespace=gobblin.example
extract.table.type=APPEND_ONLY



# source data schema
source.schema={"namespace":"example.avro", "type":"record", "name":"User", 
"fields":[{"name":"name", "type":"string"}, {"name":"favorite_number",  
"type":"int"}, {#"name":"favorite_color", "type":"string"}]}




gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","

# quality checker configuration properties
qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
qualitychecker.row.policies=gobblin.policies.schema.SchemaRowCheckPolicy
qualitychecker.row.policy.types=OPTIONAL
# data publisher class to be used
data.publisher.type=gobblin.publisher.BaseDataPublisher

# writer configuration properties
writer.destination.type=HDFS
writer.output.format=AVRO


fs.uri=hdfs://:8020/
writer.fs.uri=hdfs://...:8020/
state.store.fs.uri=hdfs://:8020/


mr.job.root.dir=/user/ndxmetadata/output/working
state.store.dir=/user/ndxmetadata/output/state-store
writer.staging.dir=/user/ndxmetadata/output/task-staging
writer.output.dir=/user/ndxmetadata/output/task-output
data.publisher.final.dir=/user/ndxmetadata/output/


---


Logs attached below

  was:
I was trying to load CSV file data into HDFS with the job configuration below, but I'm getting a
class-not-found error. I checked lib/gobblin-core.jar and the class
TextFileBasedSource is present, yet the job still reports that the class cannot be found.

Can anyone help?

The job configuration and logs follow.


*JOB:*

## job configuration file ##


job.name=json-gobblin-hdfs
job.group=Gobblin-Json-Demo
job.description=Publishing JSON data from files to HDFS in Avro format.


job.jars=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/
job.lock.enabled=false
distcp.persist.dir=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/



source.class=gobblin.source.extractor.filebased.TextFileBasedSource
converter.classes="gobblin.converter.StringSchemaInjector,gobblin.converter.csv.CsvToJsonConverter,gobblin.converter.avro.JsonIntermediateToAvroConverter"
writer.builder.class=gobblin.writer.AvroDataWriterBuilder

source.entity=
source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample
gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","


extract.table.name=CsvToAvro
extract.namespace=gobblin.example
extract.table.type=APPEND_ONLY



# source data schema
source.schema={"namespace":"example.avro", "type":"record", "name":"User", 
"fields":[{"name":"name", "type":"string"}, {"name":"favorite_number",  
"type":"int"}, {#"name":"favorite_color", "type":"string"}]}




gobblin.converter.schemaInjector.schema=SCHEMA
converter.csv.to.json.delimiter=","

# quality checker configuration properties
qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
qualitychecker.row.policies=gobblin.policies.schema.SchemaRowCheckPolicy
qualitychecker.row.policy.types=OPTIONAL
# data publisher class to be used
data.publisher.type=gobblin.publisher.BaseDataPublisher

# writer configuration properties
writer.destination.type=HDFS
writer.output.format=AVRO


fs.uri=hdfs://:8020/
writer.fs.uri=hdfs://...:8020/
state.store.fs.uri=hdfs://:8020/


mr.job.root.dir=/user/ndxmetadata/output/working
state.store.dir=/user/ndxmetadata/output/state-store
writer.staging.dir=/user/ndxmetadata/output/task-staging
writer.output.dir=/user/ndxmetadata/output/task-output

[jira] [Updated] (GOBBLIN-321) CSV to HDFS ISSUE

2017-11-22 Thread Azmal Sheik (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Azmal Sheik updated GOBBLIN-321:

Attachment: (was: job)

> CSV to HDFS ISSUE
> -
>
> Key: GOBBLIN-321
> URL: https://issues.apache.org/jira/browse/GOBBLIN-321
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Azmal Sheik
> Priority: Critical
> Labels: beginner, newbie, starter
> Attachments: gobblin-current.log, job.txt
>
>
> I was trying to load CSV file data into HDFS with the job configuration below, but I'm getting a
> class-not-found error. I checked lib/gobblin-core.jar and the class
> TextFileBasedSource is present, yet the job still reports that the class cannot be found.
> Can anyone help?
> The job configuration and logs follow.
> *JOB:*
> ## job configuration file ##
> job.name=json-gobblin-hdfs
> job.group=Gobblin-Json-Demo
> job.description=Publishing JSON data from files to HDFS in Avro format.
> job.jars=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/
> job.lock.enabled=false
> distcp.persist.dir=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/
> source.class=gobblin.source.extractor.filebased.TextFileBasedSource
> converter.classes="gobblin.converter.StringSchemaInjector,gobblin.converter.csv.CsvToJsonConverter,gobblin.converter.avro.JsonIntermediateToAvroConverter"
> writer.builder.class=gobblin.writer.AvroDataWriterBuilder
> source.entity=
> source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample
> gobblin.converter.schemaInjector.schema=SCHEMA
> converter.csv.to.json.delimiter=","
> extract.table.name=CsvToAvro
> extract.namespace=gobblin.example
> extract.table.type=APPEND_ONLY
> # source data schema
> source.schema={"namespace":"example.avro", "type":"record", "name":"User", 
> "fields":[{"name":"name", "type":"string"}, {"name":"favorite_number",  
> "type":"int"}, {#"name":"favorite_color", "type":"string"}]}
> gobblin.converter.schemaInjector.schema=SCHEMA
> converter.csv.to.json.delimiter=","
> # quality checker configuration properties
> qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
> qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
> qualitychecker.row.policies=gobblin.policies.schema.SchemaRowCheckPolicy
> qualitychecker.row.policy.types=OPTIONAL
> # data publisher class to be used
> data.publisher.type=gobblin.publisher.BaseDataPublisher
> # writer configuration properties
> writer.destination.type=HDFS
> writer.output.format=AVRO
> fs.uri=hdfs://:8020/
> writer.fs.uri=hdfs://...:8020/
> state.store.fs.uri=hdfs://:8020/
> mr.job.root.dir=/user/ndxmetadata/output/working
> state.store.dir=/user/ndxmetadata/output/state-store
> writer.staging.dir=/user/ndxmetadata/output/task-staging
> writer.output.dir=/user/ndxmetadata/output/task-output
> data.publisher.final.dir=/user/ndxmetadata/output/
> ---
> Logs attached below



[jira] [Updated] (GOBBLIN-321) CSV to HDFS ISSUE

2017-11-22 Thread Azmal Sheik (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Azmal Sheik updated GOBBLIN-321:

Attachment: job.txt

> CSV to HDFS ISSUE
> -
>
> Key: GOBBLIN-321
> URL: https://issues.apache.org/jira/browse/GOBBLIN-321
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Azmal Sheik
> Priority: Critical
> Labels: beginner, newbie, starter
> Attachments: gobblin-current.log, job.txt


[jira] [Updated] (GOBBLIN-321) CSV to HDFS ISSUE

2017-11-22 Thread Azmal Sheik (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Azmal Sheik updated GOBBLIN-321:

Attachment: job

> CSV to HDFS ISSUE
> -
>
> Key: GOBBLIN-321
> URL: https://issues.apache.org/jira/browse/GOBBLIN-321
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Azmal Sheik
> Priority: Critical
> Labels: beginner, newbie, starter
> Attachments: gobblin-current.log, job