[jira] [Work logged] (GOBBLIN-829) Fix codecov
[ https://issues.apache.org/jira/browse/GOBBLIN-829?focusedWorklogId=277089=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277089 ] ASF GitHub Bot logged work on GOBBLIN-829: -- Author: ASF GitHub Bot Created on: 16/Jul/19 00:52 Start Date: 16/Jul/19 00:52 Worklog Time Spent: 10m Work Description: autumnust commented on issue #2690: [GOBBLIN-829] Fix codecov URL: https://github.com/apache/incubator-gobblin/pull/2690#issuecomment-511621777 @htran1 Report finally shows up lol. Please merge this fix. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 277089) Time Spent: 0.5h (was: 20m) > Fix codecov > --- > > Key: GOBBLIN-829 > URL: https://issues.apache.org/jira/browse/GOBBLIN-829 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Lei Sun >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] autumnust commented on issue #2690: [GOBBLIN-829] Fix codecov
autumnust commented on issue #2690: [GOBBLIN-829] Fix codecov URL: https://github.com/apache/incubator-gobblin/pull/2690#issuecomment-511621777 @htran1 Report finally shows up lol. Please merge this fix.
[jira] [Work logged] (GOBBLIN-829) Fix codecov
[ https://issues.apache.org/jira/browse/GOBBLIN-829?focusedWorklogId=277086=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277086 ]

ASF GitHub Bot logged work on GOBBLIN-829:
--
Author: ASF GitHub Bot
Created on: 16/Jul/19 00:43
Start Date: 16/Jul/19 00:43
Worklog Time Spent: 10m
Work Description: codecov-io commented on issue #2690: [GOBBLIN-829] Fix codecov
URL: https://github.com/apache/incubator-gobblin/pull/2690#issuecomment-511620277

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=h1) Report
> :exclamation: No coverage uploaded for pull request base (`master@31d69f2`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2690/graphs/tree.svg?width=650=4MgURJ0bGc=150=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=tree)

```diff
@@            Coverage Diff            @@
##             master    #2690   +/-   ##
=========================================
  Coverage          ?   44.72%
  Complexity        ?     8650
=========================================
  Files             ?     1875
  Lines             ?    69887
  Branches          ?     7687
=========================================
  Hits              ?    31257
  Misses            ?    35737
  Partials          ?     2893
```

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=footer). Last update [31d69f2...2f0f482](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 277086)
Time Spent: 20m (was: 10m)

> Fix codecov
> ---
>
> Key: GOBBLIN-829
> URL: https://issues.apache.org/jira/browse/GOBBLIN-829
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Lei Sun
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] codecov-io commented on issue #2690: [GOBBLIN-829] Fix codecov
codecov-io commented on issue #2690: [GOBBLIN-829] Fix codecov
URL: https://github.com/apache/incubator-gobblin/pull/2690#issuecomment-511620277

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=h1) Report
> :exclamation: No coverage uploaded for pull request base (`master@31d69f2`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2690/graphs/tree.svg?width=650=4MgURJ0bGc=150=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=tree)

```diff
@@            Coverage Diff            @@
##             master    #2690   +/-   ##
=========================================
  Coverage          ?   44.72%
  Complexity        ?     8650
=========================================
  Files             ?     1875
  Lines             ?    69887
  Branches          ?     7687
=========================================
  Hits              ?    31257
  Misses            ?    35737
  Partials          ?     2893
```

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=footer). Last update [31d69f2...2f0f482](https://codecov.io/gh/apache/incubator-gobblin/pull/2690?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-829) Fix codecov
[ https://issues.apache.org/jira/browse/GOBBLIN-829?focusedWorklogId=277078=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277078 ] ASF GitHub Bot logged work on GOBBLIN-829: -- Author: ASF GitHub Bot Created on: 16/Jul/19 00:11 Start Date: 16/Jul/19 00:11 Worklog Time Spent: 10m Work Description: autumnust commented on pull request #2690: [GOBBLIN-829] Fix codecov URL: https://github.com/apache/incubator-gobblin/pull/2690 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-XXX ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): Wasn't executing jacocoReport originally ... ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 277078) Time Spent: 10m Remaining Estimate: 0h > Fix codecov > --- > > Key: GOBBLIN-829 > URL: https://issues.apache.org/jira/browse/GOBBLIN-829 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Lei Sun >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] autumnust opened a new pull request #2690: [GOBBLIN-829] Fix codecov
autumnust opened a new pull request #2690: [GOBBLIN-829] Fix codecov URL: https://github.com/apache/incubator-gobblin/pull/2690 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-XXX ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): Wasn't executing jacocoReport originally ... ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (GOBBLIN-829) Fix codecov
Lei Sun created GOBBLIN-829: --- Summary: Fix codecov Key: GOBBLIN-829 URL: https://issues.apache.org/jira/browse/GOBBLIN-829 Project: Apache Gobblin Issue Type: Improvement Reporter: Lei Sun -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] jack-moseley opened a new pull request #2689: [GOBBLIN-828] Make dynamic config override job config
jack-moseley opened a new pull request #2689: [GOBBLIN-828] Make dynamic config override job config URL: https://github.com/apache/incubator-gobblin/pull/2689 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-828 ### Description - [x] Here are some details about my PR, including screenshots (if applicable): Make dynamic config override job config instead of other way around ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: Trivial ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
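The precedence change described in this PR ("make dynamic config override job config instead of other way around") can be illustrated with a minimal sketch. It uses plain `java.util.Map` objects rather than Gobblin's actual config types, and the class name `ConfigPrecedenceDemo`, the method `withOverrides`, and the keys are all hypothetical. If the configs are Typesafe Config objects, the analogous idiom would be `dynamicConfig.withFallback(jobConfig)`, where the receiver's values win; the PR description does not say which call it uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Gobblin code.
final class ConfigPrecedenceDemo {

  // Returns a new map in which entries from `overrides` replace
  // same-keyed entries from `base`. Neither input is modified.
  static Map<String, String> withOverrides(Map<String, String> base,
                                           Map<String, String> overrides) {
    Map<String, String> merged = new HashMap<>(base);
    merged.putAll(overrides); // putAll overwrites existing keys, so overrides win
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> jobConfig = new HashMap<>();
    jobConfig.put("example.key", "from-job");      // hypothetical key
    jobConfig.put("example.jobOnly", "kept");      // only present in the job config

    Map<String, String> dynamicConfig = new HashMap<>();
    dynamicConfig.put("example.key", "from-dynamic");

    // Dynamic config is applied last, so it takes precedence on collisions.
    System.out.println(withOverrides(jobConfig, dynamicConfig));
  }
}
```

Swapping the two arguments gives the opposite precedence, which is the "other way around" behaviour the PR title refers to.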
[jira] [Work logged] (GOBBLIN-828) Make dynamic config override job config
[ https://issues.apache.org/jira/browse/GOBBLIN-828?focusedWorklogId=277077=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277077 ] ASF GitHub Bot logged work on GOBBLIN-828: -- Author: ASF GitHub Bot Created on: 16/Jul/19 00:04 Start Date: 16/Jul/19 00:04 Worklog Time Spent: 10m Work Description: jack-moseley commented on pull request #2689: [GOBBLIN-828] Make dynamic config override job config URL: https://github.com/apache/incubator-gobblin/pull/2689 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-828 ### Description - [x] Here are some details about my PR, including screenshots (if applicable): Make dynamic config override job config instead of other way around ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: Trivial ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 277077) Time Spent: 10m Remaining Estimate: 0h > Make dynamic config override job config > --- > > Key: GOBBLIN-828 > URL: https://issues.apache.org/jira/browse/GOBBLIN-828 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Jack Moseley >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] vikrambohra commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every messa
vikrambohra commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message
URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303680201

## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ##

@@ -83,17 +81,24 @@ public KeyValueEventObjectReporter(Builder builder) {
           "Key not assigned from config. Please set it with property {} Using randomly generated number {} as key ",
           ConfigurationKeys.METRICS_REPORTING_EVENTS_PUSHERKEYS, randomKey);
     }
+
+    schema = AvroUtils.overrideNameAndNamespace(GobblinTrackingEvent.getClassSchema(), builder.topic, builder.namespaceOverride);
   }

   @Override
   public void reportEventQueue(Queue queue) {
-    log.info("Emitting report using KeyValueEventObjectReporter");
     List> events = Lists.newArrayList();
     GobblinTrackingEvent event;
     while (null != (event = queue.poll())) {
-      GenericRecord record = AvroUtils.overrideNameAndNamespace(event, this.topic, this.namespaceOverride);
+
+      GenericRecord record=event;
+      try {
+        record = AvroUtils.convertRecordSchema(event, schema);

Review comment: It's not on every event but on every new instance of the reporter. So it's probably on the order of the number of mappers rather than the number of messages sent out, which is probably on the order of thousands.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?focusedWorklogId=277070=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277070 ]

ASF GitHub Bot logged work on GOBBLIN-825:
--
Author: ASF GitHub Bot
Created on: 15/Jul/19 23:45
Start Date: 15/Jul/19 23:45
Worklog Time Spent: 10m
Work Description: vikrambohra commented on pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message
URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303680201

## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ##

@@ -83,17 +81,24 @@ public KeyValueEventObjectReporter(Builder builder) {
           "Key not assigned from config. Please set it with property {} Using randomly generated number {} as key ",
           ConfigurationKeys.METRICS_REPORTING_EVENTS_PUSHERKEYS, randomKey);
     }
+
+    schema = AvroUtils.overrideNameAndNamespace(GobblinTrackingEvent.getClassSchema(), builder.topic, builder.namespaceOverride);
   }

   @Override
   public void reportEventQueue(Queue queue) {
-    log.info("Emitting report using KeyValueEventObjectReporter");
     List> events = Lists.newArrayList();
     GobblinTrackingEvent event;
     while (null != (event = queue.poll())) {
-      GenericRecord record = AvroUtils.overrideNameAndNamespace(event, this.topic, this.namespaceOverride);
+
+      GenericRecord record=event;
+      try {
+        record = AvroUtils.convertRecordSchema(event, schema);

Review comment: It's not on every event but on every new instance of the reporter. So it's probably on the order of the number of mappers rather than the number of messages sent out, which is probably on the order of thousands.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 277070)
Time Spent: 1h 20m (was: 1h 10m)

> Cache record schema in Plain Object reporters rather than create a new schema each time
> ---
>
> Key: GOBBLIN-825
> URL: https://issues.apache.org/jira/browse/GOBBLIN-825
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Vikram Bohra
> Priority: Major
> Fix For: 0.15.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Rather than create a new instance of the same schema each time, it is better to create once and re-use.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
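The "create once and re-use" idea in the GOBBLIN-825 description can be sketched as follows. This is a simplified, dependency-free illustration and not the code from the PR: `DemoSchema` and `CachingReporter` are hypothetical stand-ins for the Avro schema and the reporter class, and the build counter exists only to make the effect observable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo entry point; not part of Gobblin.
final class SchemaCachingDemo {
  public static void main(String[] args) {
    Queue<String> queue = new ArrayDeque<>(Arrays.asList("e1", "e2", "e3"));
    CachingReporter reporter = new CachingReporter("topic", "ns");
    System.out.println(reporter.reportEventQueue(queue));
    System.out.println("schemas built: " + DemoSchema.BUILD_COUNT.get());
  }
}

// Hypothetical stand-in for an expensive-to-derive schema; not an Avro class.
final class DemoSchema {
  static final AtomicInteger BUILD_COUNT = new AtomicInteger();
  final String fullName;

  DemoSchema(String name, String namespace) {
    BUILD_COUNT.incrementAndGet(); // counts how often a schema gets built
    this.fullName = namespace + "." + name;
  }
}

// Hypothetical reporter: the schema is derived once in the constructor
// and reused for every event, instead of being rebuilt inside the loop.
final class CachingReporter {
  private final DemoSchema schema;

  CachingReporter(String topic, String namespaceOverride) {
    this.schema = new DemoSchema(topic, namespaceOverride);
  }

  // Drains the queue, labelling each event with the cached schema's full name.
  List<String> reportEventQueue(Queue<String> queue) {
    List<String> out = new ArrayList<>();
    String event;
    while (null != (event = queue.poll())) {
      out.add(schema.fullName + ":" + event);
    }
    return out;
  }
}
```

With this shape, the number of schema constructions tracks the number of reporter instances rather than the number of events, which matches the point made in the review comment above.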
[jira] [Created] (GOBBLIN-828) Make dynamic config override job config
Jack Moseley created GOBBLIN-828: Summary: Make dynamic config override job config Key: GOBBLIN-828 URL: https://issues.apache.org/jira/browse/GOBBLIN-828 Project: Apache Gobblin Issue Type: Bug Reporter: Jack Moseley -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-827) Add more events
[ https://issues.apache.org/jira/browse/GOBBLIN-827?focusedWorklogId=277067=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277067 ] ASF GitHub Bot logged work on GOBBLIN-827: -- Author: ASF GitHub Bot Created on: 15/Jul/19 23:31 Start Date: 15/Jul/19 23:31 Worklog Time Spent: 10m Work Description: zxcware commented on pull request #2688: [GOBBLIN-827] Add more events URL: https://github.com/apache/incubator-gobblin/pull/2688 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. - https://issues.apache.org/jira/browse/GOBBLIN-827 ### Description - [x] Here are some details about my PR: Add the following events - `JobStateEventBuilder` to report gobblin job state or MR job state - `EntityMissingEventBuilder` to report a missing instance of a certain entity ### Tests - [ ] My PR adds the following unit tests: TBD ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 277067) Time Spent: 10m Remaining Estimate: 0h > Add more events > --- > > Key: GOBBLIN-827 > URL: https://issues.apache.org/jira/browse/GOBBLIN-827 > Project: Apache Gobblin > Issue Type: Task >Reporter: Zhixiong Chen >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Add the following events > - `JobStateEventBuilder` to report gobblin job state or MR job state > - `EntityMissingEventBuilder` to report a missing instance of a certain entity -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] zxcware opened a new pull request #2688: [GOBBLIN-827] Add more events
zxcware opened a new pull request #2688: [GOBBLIN-827] Add more events URL: https://github.com/apache/incubator-gobblin/pull/2688 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. - https://issues.apache.org/jira/browse/GOBBLIN-827 ### Description - [x] Here are some details about my PR: Add the following events - `JobStateEventBuilder` to report gobblin job state or MR job state - `EntityMissingEventBuilder` to report a missing instance of a certain entity ### Tests - [ ] My PR adds the following unit tests: TBD ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (GOBBLIN-827) Add more events
Zhixiong Chen created GOBBLIN-827: - Summary: Add more events Key: GOBBLIN-827 URL: https://issues.apache.org/jira/browse/GOBBLIN-827 Project: Apache Gobblin Issue Type: Task Reporter: Zhixiong Chen Add the following events - `JobStateEventBuilder` to report gobblin job state or MR job state - `EntityMissingEventBuilder` to report a missing instance of a certain entity -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-826) Refactor topic-specific WU configuration in kafka source
[ https://issues.apache.org/jira/browse/GOBBLIN-826?focusedWorklogId=277058=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277058 ] ASF GitHub Bot logged work on GOBBLIN-826: -- Author: ASF GitHub Bot Created on: 15/Jul/19 23:11 Start Date: 15/Jul/19 23:11 Worklog Time Spent: 10m Work Description: autumnust commented on pull request #2687: [GOBBLIN-826] Refactor topic-specific configuration injection of WU in kafka source URL: https://github.com/apache/incubator-gobblin/pull/2687 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-XXX ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 277058) Time Spent: 10m Remaining Estimate: 0h > Refactor topic-specific WU configuration in kafka source > > > Key: GOBBLIN-826 > URL: https://issues.apache.org/jira/browse/GOBBLIN-826 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Lei Sun >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] autumnust opened a new pull request #2687: [GOBBLIN-826] Refactor topic-specific configuration injection of WU in kafka source
autumnust opened a new pull request #2687: [GOBBLIN-826] Refactor topic-specific configuration injection of WU in kafka source URL: https://github.com/apache/incubator-gobblin/pull/2687 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-XXX ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (GOBBLIN-826) Refactor topic-specific WU configuration in kafka source
Lei Sun created GOBBLIN-826: --- Summary: Refactor topic-specific WU configuration in kafka source Key: GOBBLIN-826 URL: https://issues.apache.org/jira/browse/GOBBLIN-826 Project: Apache Gobblin Issue Type: Improvement Reporter: Lei Sun -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] asfgit closed pull request #2663: [GOBBLIN-796] Add support partial updates for flowConfig
asfgit closed pull request #2663: [GOBBLIN-796] Add support partial updates for flowConfig URL: https://github.com/apache/incubator-gobblin/pull/2663
[jira] [Work logged] (GOBBLIN-796) Add support partial updates for flowConfig
[ https://issues.apache.org/jira/browse/GOBBLIN-796?focusedWorklogId=277018=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277018 ] ASF GitHub Bot logged work on GOBBLIN-796: -- Author: ASF GitHub Bot Created on: 15/Jul/19 22:02 Start Date: 15/Jul/19 22:02 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2663: [GOBBLIN-796] Add support partial updates for flowConfig URL: https://github.com/apache/incubator-gobblin/pull/2663 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 277018) Time Spent: 2h 50m (was: 2h 40m) > Add support partial updates for flowConfig > -- > > Key: GOBBLIN-796 > URL: https://issues.apache.org/jira/browse/GOBBLIN-796 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jack Moseley >Priority: Major > Fix For: 0.15.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (GOBBLIN-796) Add support partial updates for flowConfig
[ https://issues.apache.org/jira/browse/GOBBLIN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hung Tran resolved GOBBLIN-796. --- Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request #2663 [https://github.com/apache/incubator-gobblin/pull/2663] > Add support partial updates for flowConfig > -- > > Key: GOBBLIN-796 > URL: https://issues.apache.org/jira/browse/GOBBLIN-796 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jack Moseley >Priority: Major > Fix For: 0.15.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hung Tran resolved GOBBLIN-825. --- Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request #2686 [https://github.com/apache/incubator-gobblin/pull/2686] > Cache record schema in Plain Object reporters rather than create a new schema > each time > --- > > Key: GOBBLIN-825 > URL: https://issues.apache.org/jira/browse/GOBBLIN-825 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Vikram Bohra >Priority: Major > Fix For: 0.15.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Rather than create a new instance of the same schema each time, it is better > to create once and re-use. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?focusedWorklogId=277004=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277004 ] ASF GitHub Bot logged work on GOBBLIN-825: -- Author: ASF GitHub Bot Created on: 15/Jul/19 21:38 Start Date: 15/Jul/19 21:38 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 277004) Time Spent: 1h 10m (was: 1h) > Cache record schema in Plain Object reporters rather than create a new schema > each time > --- > > Key: GOBBLIN-825 > URL: https://issues.apache.org/jira/browse/GOBBLIN-825 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Vikram Bohra >Priority: Major > Fix For: 0.15.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Rather than create a new instance of the same schema each time, it is better > to create once and re-use. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] asfgit closed pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message
asfgit closed pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?focusedWorklogId=277003=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277003 ] ASF GitHub Bot logged work on GOBBLIN-825: -- Author: ASF GitHub Bot Created on: 15/Jul/19 21:37 Start Date: 15/Jul/19 21:37 Worklog Time Spent: 10m Work Description: htran1 commented on pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303647832 ## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ## @@ -83,17 +81,24 @@ public KeyValueEventObjectReporter(Builder builder) { "Key not assigned from config. Please set it with property {} Using randomly generated number {} as key ", ConfigurationKeys.METRICS_REPORTING_EVENTS_PUSHERKEYS, randomKey); } + +schema = AvroUtils.overrideNameAndNamespace(GobblinTrackingEvent.getClassSchema(), builder.topic, builder.namespaceOverride); } @Override public void reportEventQueue(Queue queue) { -log.info("Emitting report using KeyValueEventObjectReporter"); List> events = Lists.newArrayList(); GobblinTrackingEvent event; while (null != (event = queue.poll())) { - GenericRecord record = AvroUtils.overrideNameAndNamespace(event, this.topic, this.namespaceOverride); + + GenericRecord record=event; + try { +record = AvroUtils.convertRecordSchema(event, schema); Review comment: This conversion is expensive to do on every event. This is the same behavior as today, so it is not making things worse, but we should consider supporting pluggable compiled schemas to avoid the need to override the namespace. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 277003) Time Spent: 1h (was: 50m) > Cache record schema in Plain Object reporters rather than create a new schema > each time > --- > > Key: GOBBLIN-825 > URL: https://issues.apache.org/jira/browse/GOBBLIN-825 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Vikram Bohra >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Rather than create a new instance of the same schema each time, it is better > to create once and re-use. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] htran1 commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message
htran1 commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303647832 ## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ## @@ -83,17 +81,24 @@ public KeyValueEventObjectReporter(Builder builder) { "Key not assigned from config. Please set it with property {} Using randomly generated number {} as key ", ConfigurationKeys.METRICS_REPORTING_EVENTS_PUSHERKEYS, randomKey); } + +schema = AvroUtils.overrideNameAndNamespace(GobblinTrackingEvent.getClassSchema(), builder.topic, builder.namespaceOverride); } @Override public void reportEventQueue(Queue queue) { -log.info("Emitting report using KeyValueEventObjectReporter"); List> events = Lists.newArrayList(); GobblinTrackingEvent event; while (null != (event = queue.poll())) { - GenericRecord record = AvroUtils.overrideNameAndNamespace(event, this.topic, this.namespaceOverride); + + GenericRecord record=event; + try { +record = AvroUtils.convertRecordSchema(event, schema); Review comment: This conversion is expensive to do on every event. This is the same behavior as today, so it is not making things worse, but we should consider supporting pluggable compiled schemas to avoid the need to override the namespace. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?focusedWorklogId=276928=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276928 ] ASF GitHub Bot logged work on GOBBLIN-825: -- Author: ASF GitHub Bot Created on: 15/Jul/19 18:50 Start Date: 15/Jul/19 18:50 Worklog Time Spent: 10m Work Description: vikrambohra commented on pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303583264 ## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ## @@ -58,6 +60,7 @@ protected KeyValuePusher pusher; protected Optional> namespaceOverride; Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 276928) Time Spent: 50m (was: 40m) > Cache record schema in Plain Object reporters rather than create a new schema > each time > --- > > Key: GOBBLIN-825 > URL: https://issues.apache.org/jira/browse/GOBBLIN-825 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Vikram Bohra >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Rather than create a new instance of the same schema each time, it is better > to create once and re-use. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] vikrambohra commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every messa
vikrambohra commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303583264 ## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ## @@ -58,6 +60,7 @@ protected KeyValuePusher pusher; protected Optional> namespaceOverride; Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?focusedWorklogId=276914=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276914 ] ASF GitHub Bot logged work on GOBBLIN-825: -- Author: ASF GitHub Bot Created on: 15/Jul/19 18:31 Start Date: 15/Jul/19 18:31 Worklog Time Spent: 10m Work Description: zxcware commented on pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303575371 ## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ## @@ -58,6 +60,7 @@ protected KeyValuePusher pusher; protected Optional> namespaceOverride; protected final String topic; Review comment: remove this This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 276914) Time Spent: 40m (was: 0.5h) > Cache record schema in Plain Object reporters rather than create a new schema > each time > --- > > Key: GOBBLIN-825 > URL: https://issues.apache.org/jira/browse/GOBBLIN-825 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Vikram Bohra >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Rather than create a new instance of the same schema each time, it is better > to create once and re-use. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?focusedWorklogId=276913=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276913 ] ASF GitHub Bot logged work on GOBBLIN-825: -- Author: ASF GitHub Bot Created on: 15/Jul/19 18:31 Start Date: 15/Jul/19 18:31 Worklog Time Spent: 10m Work Description: zxcware commented on pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303575435 ## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ## @@ -58,6 +60,7 @@ protected KeyValuePusher pusher; protected Optional> namespaceOverride; Review comment: remove this This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 276913) Time Spent: 0.5h (was: 20m) > Cache record schema in Plain Object reporters rather than create a new schema > each time > --- > > Key: GOBBLIN-825 > URL: https://issues.apache.org/jira/browse/GOBBLIN-825 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Vikram Bohra >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Rather than create a new instance of the same schema each time, it is better > to create once and re-use. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] zxcware commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message
zxcware commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303575435 ## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ## @@ -58,6 +60,7 @@ protected KeyValuePusher pusher; protected Optional> namespaceOverride; Review comment: remove this This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-gobblin] zxcware commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message
zxcware commented on a change in pull request #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#discussion_r303575371 ## File path: gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/metrics/reporter/KeyValueEventObjectReporter.java ## @@ -58,6 +60,7 @@ protected KeyValuePusher pusher; protected Optional> namespaceOverride; protected final String topic; Review comment: remove this This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?focusedWorklogId=276899=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276899 ] ASF GitHub Bot logged work on GOBBLIN-825: -- Author: ASF GitHub Bot Created on: 15/Jul/19 18:02 Start Date: 15/Jul/19 18:02 Worklog Time Spent: 10m Work Description: vikrambohra commented on issue #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#issuecomment-511507734 @zxcware This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 276899) Time Spent: 20m (was: 10m) > Cache record schema in Plain Object reporters rather than create a new schema > each time > --- > > Key: GOBBLIN-825 > URL: https://issues.apache.org/jira/browse/GOBBLIN-825 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Vikram Bohra >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Rather than create a new instance of the same schema each time, it is better > to create once and re-use. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] vikrambohra commented on issue #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message
vikrambohra commented on issue #2686: [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message URL: https://github.com/apache/incubator-gobblin/pull/2686#issuecomment-511507734 @zxcware This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-825) Cache record schema in Plain Object reporters rather than create a new schema each time
[ https://issues.apache.org/jira/browse/GOBBLIN-825?focusedWorklogId=276897=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276897 ] ASF GitHub Bot logged work on GOBBLIN-825: -- Author: ASF GitHub Bot Created on: 15/Jul/19 18:01 Start Date: 15/Jul/19 18:01 Worklog Time Spent: 10m Work Description: vikrambohra commented on pull request #2686: [GOBBLIN-825] Make schema a member variable and override name and nam… URL: https://github.com/apache/incubator-gobblin/pull/2686 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-825 ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): An issue with a schema registry client caused this issue. This issue fixes the creation of new instances of the same schema for every message sent. Instead we now create an instance of the schema during object construction and use the same schema instance for all the messages. ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: - overrideSchemaNameAndNamespaceTest in AvroUtilsTest.java - KeyValueEventObjectReporterTest.java - KeyValueMetricObjectReporterTest.java ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. 
Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 276897) Time Spent: 10m Remaining Estimate: 0h > Cache record schema in Plain Object reporters rather than create a new schema > each time > --- > > Key: GOBBLIN-825 > URL: https://issues.apache.org/jira/browse/GOBBLIN-825 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Vikram Bohra >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Rather than create a new instance of the same schema each time, it is better > to create once and re-use. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] vikrambohra opened a new pull request #2686: [GOBBLIN-825] Make schema a member variable and override name and nam…
vikrambohra opened a new pull request #2686: [GOBBLIN-825] Make schema a member variable and override name and nam… URL: https://github.com/apache/incubator-gobblin/pull/2686 Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-825 ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): An issue with a schema registry client caused this issue. This issue fixes the creation of new instances of the same schema for every message sent. Instead we now create an instance of the schema during object construction and use the same schema instance for all the messages. ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: - overrideSchemaNameAndNamespaceTest in AvroUtilsTest.java - KeyValueEventObjectReporterTest.java - KeyValueMetricObjectReporterTest.java ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
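For readers skimming the thread, the change described in the PR above (build the schema once at object construction and reuse that instance for every message) can be sketched roughly as follows. This is a minimal illustration and not the code from PR #2686: `SketchSchema` and `CachingReporterSketch` are hypothetical stand-ins, and the instance counter exists only to make the reuse visible. The real reporter works with Avro schemas and, per the review comment, still converts each record.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an Avro schema; counts how many instances get built. */
final class SketchSchema {
    static int instancesCreated = 0;
    final String name;
    final String namespace;

    SketchSchema(String name, String namespace) {
        this.name = name;
        this.namespace = namespace;
        instancesCreated++;
    }
}

/** Hypothetical reporter that derives its schema once, at construction. */
final class CachingReporterSketch {
    // Built once in the constructor and shared by every message.
    private final SketchSchema schema;

    CachingReporterSketch(String topic, String namespaceOverride) {
        this.schema = new SketchSchema(topic, namespaceOverride);
    }

    /** Returns the schema used for each payload; it is always the cached instance. */
    List<SketchSchema> report(List<String> payloads) {
        List<SketchSchema> schemasUsed = new ArrayList<>();
        for (int i = 0; i < payloads.size(); i++) {
            schemasUsed.add(schema);
        }
        return schemasUsed;
    }
}
```

With this shape, reporting any number of messages creates exactly one schema instance per reporter, which is the behavior the JIRA description asks for.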
[jira] [Resolved] (GOBBLIN-810) Include flow edge ID in job name
[ https://issues.apache.org/jira/browse/GOBBLIN-810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hung Tran resolved GOBBLIN-810. --- Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request #2675 [https://github.com/apache/incubator-gobblin/pull/2675] > Include flow edge ID in job name > > > Key: GOBBLIN-810 > URL: https://issues.apache.org/jira/browse/GOBBLIN-810 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Jack Moseley >Priority: Major > Fix For: 0.15.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-810) Include flow edge ID in job name
[ https://issues.apache.org/jira/browse/GOBBLIN-810?focusedWorklogId=276894=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276894 ] ASF GitHub Bot logged work on GOBBLIN-810: -- Author: ASF GitHub Bot Created on: 15/Jul/19 17:53 Start Date: 15/Jul/19 17:53 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2675: [GOBBLIN-810] Include flow edge ID in job name URL: https://github.com/apache/incubator-gobblin/pull/2675 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 276894) Time Spent: 1h (was: 50m) > Include flow edge ID in job name > > > Key: GOBBLIN-810 > URL: https://issues.apache.org/jira/browse/GOBBLIN-810 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Jack Moseley >Priority: Major > Fix For: 0.15.0 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] asfgit closed pull request #2675: [GOBBLIN-810] Include flow edge ID in job name
asfgit closed pull request #2675: [GOBBLIN-810] Include flow edge ID in job name URL: https://github.com/apache/incubator-gobblin/pull/2675 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-796) Add support partial updates for flowConfig
[ https://issues.apache.org/jira/browse/GOBBLIN-796?focusedWorklogId=276879=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276879 ] ASF GitHub Bot logged work on GOBBLIN-796: -- Author: ASF GitHub Bot Created on: 15/Jul/19 17:31 Start Date: 15/Jul/19 17:31 Worklog Time Spent: 10m Work Description: arjun4084346 commented on issue #2663: [GOBBLIN-796] Add support partial updates for flowConfig URL: https://github.com/apache/incubator-gobblin/pull/2663#issuecomment-511496000 +1. LGTM. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 276879) Time Spent: 2h 40m (was: 2.5h) > Add support partial updates for flowConfig > -- > > Key: GOBBLIN-796 > URL: https://issues.apache.org/jira/browse/GOBBLIN-796 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jack Moseley >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] arjun4084346 commented on issue #2663: [GOBBLIN-796] Add support partial updates for flowConfig
arjun4084346 commented on issue #2663: [GOBBLIN-796] Add support partial updates for flowConfig URL: https://github.com/apache/incubator-gobblin/pull/2663#issuecomment-511496000 +1. LGTM. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
Re: MR mode wikipedia example fails
I forgot to mention that I used Hadoop 2.7.7 for this; I wonder what version LinkedIn is using. Thanks On Sun, Jul 14, 2019 at 6:09 PM Jay Sen wrote: > Hi Dev Team, > > The PullFromWikipedia example fails in Gobblin MR mode. > > The issue I have found so far is that even after the MR job completes > successfully and Gobblin sees it as successful, Gobblin explicitly marks it as FAILED because > "workunit.working.state" is missing from the WorkUnitState (in the SafeDatasetCommit > # finalizeDatasetStateBeforeCommit method). > > For reference, this is how I believe the states are structured: > JobState -> DatasetState -> TaskState -> WorkUnitState > > Since the property is missing from the WorkUnitState, the task gets the "PENDING" state by default > (via taskState.getWorkingState()), and > finalizeDatasetStateBeforeCommit sets any state > other than "SUCCESSFUL" to FAILED. > > Now, I am not sure whether this is a bug or whether I am missing some config, > such as JobCommitPolicy, or a setting that tells Gobblin not to commit at the job level at all. > > Can someone please take a look and comment? > > Thanks > Jay
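The behavior reported in the quoted message can be sketched as below. This is only a reading of that description, not Gobblin's actual SafeDatasetCommit code: `WorkingState`, `SketchTaskState`, and `FinalizeSketch` are hypothetical names, and the only details taken from the message are the "workunit.working.state" key, the PENDING default, and the rule that anything other than SUCCESSFUL ends up FAILED.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the working-state values; not Gobblin's real enum. */
enum WorkingState { PENDING, RUNNING, SUCCESSFUL, FAILED }

/** Hypothetical, simplified task state backed by string properties. */
class SketchTaskState {
    static final String WORKING_STATE_KEY = "workunit.working.state";
    private final Map<String, String> props = new HashMap<>();

    /** Falls back to PENDING when the property was never written. */
    WorkingState getWorkingState() {
        return WorkingState.valueOf(
            props.getOrDefault(WORKING_STATE_KEY, WorkingState.PENDING.name()));
    }

    void setWorkingState(WorkingState state) {
        props.put(WORKING_STATE_KEY, state.name());
    }
}

class FinalizeSketch {
    /** Marks every task that is not SUCCESSFUL as FAILED; returns true if none were changed. */
    static boolean finalizeBeforeCommit(List<SketchTaskState> tasks) {
        boolean allSuccessful = true;
        for (SketchTaskState task : tasks) {
            if (task.getWorkingState() != WorkingState.SUCCESSFUL) {
                task.setWorkingState(WorkingState.FAILED);
                allSuccessful = false;
            }
        }
        return allSuccessful;
    }
}
```

Under these assumptions, a task whose state property was never written reads back as PENDING and is then marked FAILED, even if the underlying MR job finished cleanly. That matches the symptom described, but it does not show whether the missing property is a bug or a configuration gap.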