Re: [ANNOUNCE] New Committer: Tamas Nemeth

2018-11-16 Thread Vicky Kak
Congrats Tamas. On Fri, Nov 16, 2018 at 1:20 PM Abhishek Tiwari wrote: > Hi Gobblin community, > > The Project Management Committee (PMC) for Apache Gobblin (incubating) > has invited Tamas Nemeth to become a committer and we are pleased to > announce that he has accepted. > > Tamas has been a

Re: Gitter message archive

2018-11-14 Thread Vicky Kak
This is a nice move. On Wed, Nov 14, 2018 at 1:22 PM Abhishek Tiwari wrote: > Hi all, > > I have enabled periodic archival of Gitter messages to our mailing list. > This buys us the best of both worlds: > 1. Continued realtime conversation through Gitter > 2. Continuation of conversation at

Re: KafkaSimpleJsonExtractor

2018-04-27 Thread Vicky Kak
he.gobblin.metrics.reporter.KafkaAvroReporterTest. > STARTED > > > > Gradle suite > Gradle test > > org.apache.gobblin.metrics.reporter.KafkaAvroReporterTest. > PASSED > > > > Gradle suite > Gradle test > > org.apache.gobblin.metrics.reporter.Kaf

Re: GAAS feedback.

2018-04-25 Thread Vicky Kak
> > Abhishek > > > On Tue, Apr 24, 2018 at 6:40 AM, Vicky Kak <vicky@gmail.com> wrote: > >> Hi Guys, >> >> I have created multiple JIRA's based on the discussion we had in this >> thread, these are >> https://issues.apache.org/jira/browse/GOBBLIN-

Re: GAAS feedback.

2018-04-24 Thread Vicky Kak
we could deploy the artifacts (templates into the gaas-service container and libraries in the cluster worker/master nodes) using the docker cp command. Thanks, Vicky On Tue, Apr 3, 2018 at 5:15 AM, Abhishek Tiwari <a...@apache.org> wrote: > Hi Vicky, > > I had a follow-up with Sudarsh

Re: KafkaSimpleJsonExtractor

2018-04-19 Thread Vicky Kak
> Value is null > > > > > > Did you have a chance to research adding annotation, @Alias (“ > KafkaSimpleJsonExtractor”)? > > > > > > *From:* Vicky Kak [mailto:vicky@gmail.com] > *Sent:* Saturday, April 14, 2018 12:19 AM > > *To:* user@

Re: KafkaSimpleJsonExtractor

2018-04-13 Thread Vicky Kak
ource along with property, > gobblin.source.kafka.extractorType. > As you know, KafkaSimpleJsonExtractor.decodeRecord() combines both > ByteArrayBasedKafkaRecord.getKeyBytes() and ByteArrayBasedKafkaRecord.getMessageBytes(), > these result we seek. > > > > Or did we miss somet

Re: Data Extraction from Oracle and Ingesting to HDFS

2018-04-12 Thread Vicky Kak
Have you looked at this one? https://github.com/apache/incubator-gobblin/blob/master/gobblin-modules/gobblin-sql/src/main/java/org/apache/gobblin/source/extractor/extract/jdbc/OracleSource.java On Thu, Apr 12, 2018 at 9:56 PM, phani bhushan peddi wrote: > Hi, > I am

Re: KafkaSimpleJsonExtractor

2018-04-12 Thread Vicky Kak
; return decodedRecord; > > } > > > > The architecture diagram leads us to think that the extractor is the point > where both Kafka key and value are visible in the Gobblin pipeline: > > https://gobblin.readthedocs.io/en/latest/Gobblin-Architecture/#gob

Gobblin Clustering for Streaming.

2018-03-28 Thread Vicky Kak
Hi Guys, I am in the process of using the Gobblin cluster to address the streaming use case; I have yet to look at the code. However, I would like to validate my understanding and design approaches based on the quantum of data to be ingested via Gobblin. Following is how I will classify the gobblin

GAAS feedback.

2018-01-09 Thread Vicky Kak
Hi Guys, I have finally managed to install GAAS with the Standalone Cluster. Here are some observations to share: 1) I am running GAAS and the Standalone cluster on the same machine and from the same distribution; this will typically be needed for a quick setup. Since I have been starting

Re: Zero byte file, need help on Gobbli

2017-11-12 Thread Vicky Kak
t.format=PARQUET > writer.staging.dir=${env:GOBBLIN_WORK_DIR}/${job.name}/staging > writer.partitioner.class=com.mmk.gobblin.writer.partitioner. > MmkSchemaTimestampPartitioner > > > On Nov 13, 2017 11:42 AM, "Vicky Kak" <vicky@gmail.com> wrote: > >> Please explain your use case an

Re: Zero byte file, need help on Gobbli

2017-11-12 Thread Vicky Kak
Please explain your use case and attach the corresponding job configuration and gobblin log file if possible. On Mon, Nov 13, 2017 at 11:02 AM, Mohan wrote: > Some time I'm getting zero byte parquet file, could you please tell me is > there any reason and size of the

Re: Corrupted state file when Jobs are being run in parallel.

2017-11-11 Thread Vicky Kak
) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Thanks, Vicky On Sun, Nov 12, 2017 at 9:40 AM, Vicky Kak <vicky@gmail.com> wrote: > Hi Hung, > > Please find my replies marked in

Re: Corrupted state file when Jobs are being run in parallel.

2017-11-11 Thread Vicky Kak
nsfer state across executions, like cases where a > watermark is used to resume an incremental pull. > > > Hung. > -- > *From:* Vicky Kak <vicky@gmail.com> > *Sent:* Saturday, November 11, 2017 5:58:59 AM > *To:* user@gobblin.incubator.

Re: Possible leak due to Metrics.

2017-11-10 Thread Vicky Kak
, Vicky On Thu, Nov 9, 2017 at 7:24 PM, Vicky Kak <vicky@gmail.com> wrote: > Hi Guys, > > We have got the following use case > > 1) The Custom server implemented in Gobblin which acts as a rest interface > and launches the gobblin Job. > > 2) During the str

Re: Need help on Gobblin

2017-11-09 Thread Vicky Kak
$1$1. > call(GobblinMultiTaskAttempt.java:109) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.util.concurrent.Executors$RunnableAdapter. > call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.util.concurrent.ThreadPoolExecutor.runWorker( > ThreadPoolExe

Re: Need help on Gobblin

2017-11-09 Thread Vicky Kak
Can you attach the complete log file and briefly explain the pull file configuration? The above description seems to be a warning, and I am not sure where it is coming from. Also, please state the Gobblin version you have been using, as that would make it easier to look into the code. On Fri, Nov 10, 2017

Re: GAAS installation issues.

2017-10-18 Thread Vicky Kak
, GobblinClusterConfigurationKeys.GOBBLIN_CLUSTER_LOG4J_CONFIGURATION_FILE, GobblinClusterConfigurationKeys.GOBBLIN_CLUSTER_LOG4J_CONFIGURATION_FILE); On Wed, Oct 18, 2017 at 5:10 PM, Vicky Kak <vicky@gmail.com> wrote: > >>ERROR [GobblinServiceManager]

Re: GAAS installation issues.

2017-10-18 Thread Vicky Kak
ties got rid of the above error ${gobblin.service.work.dir} Regards, Vicky On Wed, Oct 18, 2017 at 4:58 PM, Vicky Kak <vicky@gmail.com> wrote: > Hey Guys, > > I have been trying to configure GAAS based and have got the following > observations > > 1) I am using gobblin-servic

GAAS installation issues.

2017-10-18 Thread Vicky Kak
Hey Guys, I have been trying to configure GAAS and have the following observations: 1) I am using gobblin-service.sh start to start the Orchestrator Application; however, I don't see it getting started. I am getting the following information in the master.out log as WARN

Streaming Use Case Documentation.

2017-09-14 Thread Vicky Kak
Hey Guys, I am trying to port the streaming use case of Twitter streams (https://dev.twitter.com/streaming/public) to Gobblin. I would like to understand how to start with it. Is there any documentation apart from what I see here and the code base?

No Data Partitioning Use case.

2017-09-11 Thread Vicky Kak
Hi Guys, I am wondering if someone in the community has implemented a Gobblin plugin where the Source implementation is not able to partition the data, so it has a single partition, i.e. a single WorkUnit entry. We have got a use case where we don't have a way to partition data and

Re: Partition meta data not present.

2017-09-08 Thread Vicky Kak
ta D > (whole data?) in source. If so what is the work left for workunits? > - What you mean by keeping things in-memory between source / workunits. > That wont be possible for something like Yarn mode. > > Regards, > Abhishek > > On Wed, Sep 6, 2017 at 5:20 AM, Vicky Kak

Re: Partition meta data not present.

2017-09-06 Thread Vicky Kak
. Regards, Vicky On Tue, Sep 5, 2017 at 6:48 PM, Vicky Kak <vicky@gmail.com> wrote: > I am not able to see this email yet in the email archive here > https://lists.apache.org/list.html?user@gobblin.incubator.apache.org > > Can anyone take a note of it and get it working?

Partition meta data not present.

2017-08-30 Thread Vicky Kak
Hi Guys, We have got a use case where there is no meta data information about the data to be processed in Gobblin. We need to read the whole data chunk and then create a partition, I would be interested to know how this is being addressed by others. Let me explain it with the sample generic data,

Re: Ecplise IDE import hacks

2017-08-06 Thread Vicky Kak
, > Do you want to add this to common-pitfalls? > > Abhishek > > -- Forwarded message -- > From: Vicky Kak <vicky@gmail.com> > Date: Thu, Apr 20, 2017 at 11:24 PM > Subject: Ecplise IDE import hacks > To: gobblin-users <gobblin-us...@google

Re: Gobblin As Service Questions

2017-07-28 Thread Vicky Kak
is configured in the Gobblin instances where the Jobs should be constructed and triggered. How and where do we configure the SpecExecutorInstance for the Gobblin instances for which the Jobs can be configured/triggered via GAAS? Thanks, Vicky On Fri, Jul 28, 2017 at 9:07 AM, Vicky Kak <vicky.