[ 
https://issues.apache.org/jira/browse/GOBBLIN-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arjun Singh Bora updated GOBBLIN-48:
------------------------------------
    Sprint: Apache Gobblin 170807

> java.lang.IllegalArgumentException when using extract.limit.enabled
> -------------------------------------------------------------------
>
>                 Key: GOBBLIN-48
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-48
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Abhishek Tiwari
>            Assignee: Arjun Singh Bora
>
> Hi,
> I'm trying to set a time limit on Gobblin when extracting data from kafka,
> my job config looks like this:
> ```
> job.name=events
> job.group=events
> job.description=events
> job.lock.enabled=true
> job.schedule=0 0 0/1 1/1 * ? *
> kafka.brokers=${env:KAFKA_BROKERS}
> kafka.workunit.packer.type=BI_LEVEL
> source.class=gobblin.source.extractor.extract.kafka.KafkaSimpleSource
> extract.namespace=gobblin.extract.kafka
> extract.limit.enabled=true
> extract.limit.type=time
> extract.limit.time.limit=120
> extract.limit.time.limit.timeunit=minutes
> converter.classes=com.custom.gobblin.GobblinBytesToAvroConverter
> writer.partitioner.class=com.custom.gobblin.GobblinDailyPartitioner
> writer.partition.pattern=YYYY/MM/dd
> writer.partition.columns=timestamp
> writer.builder.class=gobblin.writer.AvroDataWriterBuilder
> writer.file.path.type=tablename
> writer.partition.timezone=UTC
> writer.destination.type=HDFS
> writer.output.format=AVRO
> writer.codec.type=SNAPPY
> writer.deflate.level=6
> writer.include.partition.in.file.names=false
> mr.job.max.mappers=5
> topic.whitelist=.*
> data.publisher.type=gobblin.publisher.TimePartitionedDataPublisher
> metrics.reporting.file.enabled=true
> metrics.log.dir=${env:GOBBLIN_WORK_DIR}/metrics
> metrics.reporting.file.suffix=txt
> ```
> Looking at the docs, this should work, but I keep getting:
> ```
> 2017-03-06 17:15:01 ERROR Task:266 - Task task_events_1488816000695_68 failed
> java.lang.IllegalArgumentException
>       at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:77)
>       at 
> gobblin.util.limiter.DefaultLimiterFactory.newLimiter(DefaultLimiterFactory.java:74)
>       at gobblin.runtime.TaskContext.getExtractor(TaskContext.java:112)
>       at gobblin.runtime.Task.run(Task.java:127)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> ```
> I'm using gobblin 0.8.0.
> Thanks.
>  
> *Github Url* : https://github.com/linkedin/gobblin/issues/1675 
> *Github Reporter* : *guilhermef* 
> *Github Created At* : 2017-03-06T18:23:07Z 
> *Github Updated At* : 2017-05-19T05:58:26Z 
> h3. Comments 
> ----
> [~abti] wrote on 2017-03-30T00:16:10Z : jira-assignee:lesun
> jira-label:gobblin-os-github
> On Mon, Mar 6, 2017 at 10:23 AM, Guilherme Souza <notificati...@github.com>
> wrote:
> > Hi,
> > I'm trying to set a time limit on Gobblin when extracting data from kafka,
> > my job config looks like this:
> >
> > job.name=events
> > job.group=events
> > job.description=events
> > job.lock.enabled=true
> > job.schedule=0 0 0/1 1/1 * ? *
> >
> > kafka.brokers=${env:KAFKA_BROKERS}
> > kafka.workunit.packer.type=BI_LEVEL
> >
> > source.class=gobblin.source.extractor.extract.kafka.KafkaSimpleSource
> > extract.namespace=gobblin.extract.kafka
> > extract.limit.enabled=true
> > extract.limit.type=time
> > extract.limit.time.limit=120
> > extract.limit.time.limit.timeunit=minutes
> >
> > converter.classes=com.custom.gobblin.GobblinBytesToAvroConverter
> >
> > writer.partitioner.class=com.custom.gobblin.GobblinDailyPartitioner
> > writer.partition.pattern=YYYY/MM/dd
> > writer.partition.columns=timestamp
> > writer.builder.class=gobblin.writer.AvroDataWriterBuilder
> > writer.file.path.type=tablename
> > writer.partition.timezone=UTC
> > writer.destination.type=HDFS
> > writer.output.format=AVRO
> >
> > writer.codec.type=SNAPPY
> > writer.deflate.level=6writer.include.partition.in.file.names=false
> >
> > mr.job.max.mappers=5
> >
> > topic.whitelist=.*
> > data.publisher.type=gobblin.publisher.TimePartitionedDataPublisher
> >
> > metrics.reporting.file.enabled=true
> > metrics.log.dir=${env:GOBBLIN_WORK_DIR}/metrics
> > metrics.reporting.file.suffix=txt
> >
> > Looking at the docs, this should work, but I keep getting:
> >
> > 2017-03-06 17:15:01 ERROR Task:266 - Task task_events_1488816000695_68 
> > failed
> > java.lang.IllegalArgumentException
> >     at 
> > com.google.common.base.Preconditions.checkArgument(Preconditions.java:77)
> >     at 
> > gobblin.util.limiter.DefaultLimiterFactory.newLimiter(DefaultLimiterFactory.java:74)
> >     at gobblin.runtime.TaskContext.getExtractor(TaskContext.java:112)
> >     at gobblin.runtime.Task.run(Task.java:127)
> >     at 
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >     at 
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> >     at 
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> >     at 
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >     at 
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >     at java.lang.Thread.run(Thread.java:745)
> >
> > Thanks.
> >
> > —
> > You are receiving this because you are subscribed to this thread.
> > Reply to this email directly, view it on GitHub
> > <https://github.com/linkedin/gobblin/issues/1675>, or mute the thread
> > <https://github.com/notifications/unsubscribe-auth/AAEPe5Mge09MDxpWNEjMlTXs6SZm9BIBks5rjE8MgaJpZM4MUgwu>
> > .
> >
>  
>  
> *Github Url* : 
> https://github.com/linkedin/gobblin/issues/1675#issuecomment-290264501 
> ----
> [~ibuenros] wrote on 2017-03-30T00:46:36Z : Hello,
> You are using the wrong key. Please use extract.limit.timeLimit and
> extract.limit.timeLimitTimeunit instead. If this is wrong in the docs,
> can you please point us to the location so we can fix it?
> Thanks,
> Issac
> On Mon, Mar 6, 2017 at 10:23 AM, Guilherme Souza <notificati...@github.com>
> wrote:
> > Hi,
> > I'm trying to set a time limit on Gobblin when extracting data from kafka,
> > my job config looks like this:
> >
> > job.name=events
> > job.group=events
> > job.description=events
> > job.lock.enabled=true
> > job.schedule=0 0 0/1 1/1 * ? *
> >
> > kafka.brokers=${env:KAFKA_BROKERS}
> > kafka.workunit.packer.type=BI_LEVEL
> >
> > source.class=gobblin.source.extractor.extract.kafka.KafkaSimpleSource
> > extract.namespace=gobblin.extract.kafka
> > extract.limit.enabled=true
> > extract.limit.type=time
> > extract.limit.time.limit=120
> > extract.limit.time.limit.timeunit=minutes
> >
> > converter.classes=com.custom.gobblin.GobblinBytesToAvroConverter
> >
> > writer.partitioner.class=com.custom.gobblin.GobblinDailyPartitioner
> > writer.partition.pattern=YYYY/MM/dd
> > writer.partition.columns=timestamp
> > writer.builder.class=gobblin.writer.AvroDataWriterBuilder
> > writer.file.path.type=tablename
> > writer.partition.timezone=UTC
> > writer.destination.type=HDFS
> > writer.output.format=AVRO
> >
> > writer.codec.type=SNAPPY
> > writer.deflate.level=6writer.include.partition.in.file.names=false
> >
> > mr.job.max.mappers=5
> >
> > topic.whitelist=.*
> > data.publisher.type=gobblin.publisher.TimePartitionedDataPublisher
> >
> > metrics.reporting.file.enabled=true
> > metrics.log.dir=${env:GOBBLIN_WORK_DIR}/metrics
> > metrics.reporting.file.suffix=txt
> >
> > Looking at the docs, this should work, but I keep getting:
> >
> > 2017-03-06 17:15:01 ERROR Task:266 - Task task_events_1488816000695_68 
> > failed
> > java.lang.IllegalArgumentException
> >     at 
> > com.google.common.base.Preconditions.checkArgument(Preconditions.java:77)
> >     at 
> > gobblin.util.limiter.DefaultLimiterFactory.newLimiter(DefaultLimiterFactory.java:74)
> >     at gobblin.runtime.TaskContext.getExtractor(TaskContext.java:112)
> >     at gobblin.runtime.Task.run(Task.java:127)
> >     at 
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >     at 
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> >     at 
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> >     at 
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >     at 
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >     at java.lang.Thread.run(Thread.java:745)
> >
> > Thanks.
> >
> > —
> > You are receiving this because you are subscribed to this thread.
> > Reply to this email directly, view it on GitHub
> > <https://github.com/linkedin/gobblin/issues/1675>, or mute the thread
> > <https://github.com/notifications/unsubscribe-auth/ABTQkB6yYANduaDu-j9ykG27ztSeWMoJks5rjE8NgaJpZM4MUgwu>
> > .
> >
>  
>  
> *Github Url* : 
> https://github.com/linkedin/gobblin/issues/1675#issuecomment-290268851 
> ----
> *madiks* wrote on 2017-05-19T05:58:26Z : I found the location in the doc, 
> it's in Case Studies » Kafka-HDFS Ingestion
> https://github.com/linkedin/gobblin/blob/master/gobblin-docs/case-studies/Kafka-HDFS-Ingestion.md
> https://gobblin.readthedocs.io/en/latest/case-studies/Kafka-HDFS-Ingestion/
>  
>  
> *Github Url* : 
> https://github.com/linkedin/gobblin/issues/1675#issuecomment-302615980



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to