[GitHub] [incubator-gobblin] jhsenjaliya commented on issue #2586: [GOBBLIN-719] fix invalid git links for classes in docs
jhsenjaliya commented on issue #2586: [GOBBLIN-719] fix invalid git links for classes in docs URL: https://github.com/apache/incubator-gobblin/pull/2586#issuecomment-480150157 @yukuai518, can you merge the changes? I need to update docs as part of #2578, and it will help to avoid conflicts. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-719) gobblin-docs has invalid git links
[ https://issues.apache.org/jira/browse/GOBBLIN-719?focusedWorklogId=223441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223441 ] ASF GitHub Bot logged work on GOBBLIN-719: -- Author: ASF GitHub Bot Created on: 05/Apr/19 05:09 Start Date: 05/Apr/19 05:09 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on issue #2586: [GOBBLIN-719] fix invalid git links for classes in docs URL: https://github.com/apache/incubator-gobblin/pull/2586#issuecomment-480150157 @yukuai518, can you merge the changes? I need to update docs as part of #2578, and it will help to avoid conflicts. Thanks. Issue Time Tracking --- Worklog Id: (was: 223441) Time Spent: 0.5h (was: 20m) > gobblin-docs has invalid git links > -- > > Key: GOBBLIN-719 > URL: https://issues.apache.org/jira/browse/GOBBLIN-719 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Jay Sen >Priority: Trivial > Time Spent: 0.5h > Remaining Estimate: 0h > > The Gobblin docs had some invalid links pointing not only to the LinkedIn repo but also > to old locations of classes that have changed since then. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=223440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223440 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 05/Apr/19 05:09 Start Date: 05/Apr/19 05:09 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272445741 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: OK, sure. It requires a lot of doc changes and some reorganization, which I can take care of, but can we get #2586 merged? Otherwise I'll have a lot of conflicts. Issue Time Tracking --- Worklog Id: (was: 223440) Time Spent: 2h 20m (was: 2h 10m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > Gobblin supports multiple modes of execution (CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR), and there is an individual > script for each of them. > 1.
There can be one gobblin.sh script > {{gobblin.sh }} > {{gobblin.sh }} > {{command values: admin, cli, statestore-check, statestore-clean, > historystore-manager}} > {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, > service}} > 2. Configs also need to be structured and deduplicated accordingly. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
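As an illustration of the single-entry-point idea in GOBBLIN-707, the sketch below classifies the first argument as a command or a service, using the value lists quoted in the issue. The issue itself proposes a `gobblin.sh` shell script; Java is used here only for illustration, and `GobblinLauncherSketch`, `Kind`, and `classify` are hypothetical names that do not come from Gobblin.

```java
import java.util.Set;

/** Hypothetical sketch: decide whether the first launcher argument is a command or a service. */
final class GobblinLauncherSketch {

  enum Kind { COMMAND, SERVICE, UNKNOWN }

  // Value lists copied from the GOBBLIN-707 description.
  private static final Set<String> COMMANDS =
      Set.of("admin", "cli", "statestore-check", "statestore-clean", "historystore-manager");
  private static final Set<String> SERVICES =
      Set.of("standalone", "cluster-master", "cluster-worker", "aws", "yarn", "mr", "service");

  /** Returns the kind of the given first argument, or UNKNOWN if it is missing or unrecognized. */
  static Kind classify(String firstArg) {
    if (firstArg == null) {
      return Kind.UNKNOWN;
    }
    if (COMMANDS.contains(firstArg)) {
      return Kind.COMMAND;
    }
    if (SERVICES.contains(firstArg)) {
      return Kind.SERVICE;
    }
    return Kind.UNKNOWN;
  }

  public static void main(String[] args) {
    String first = args.length > 0 ? args[0] : null;
    System.out.println(classify(first));
  }
}
```

A single dispatch table like this is one way to keep the list of modes in one place instead of spreading it over one script per mode.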
[GitHub] [incubator-gobblin] jhsenjaliya commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
jhsenjaliya commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272445741 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: OK, sure. It requires a lot of doc changes and some reorganization, which I can take care of, but can we get #2586 merged? Otherwise I'll have a lot of conflicts.
[jira] [Work logged] (GOBBLIN-724) Throttling server delays responses for throttling causing too many connections
[ https://issues.apache.org/jira/browse/GOBBLIN-724?focusedWorklogId=223370=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223370 ] ASF GitHub Bot logged work on GOBBLIN-724: -- Author: ASF GitHub Bot Created on: 04/Apr/19 23:47 Start Date: 04/Apr/19 23:47 Worklog Time Spent: 10m Work Description: ibuenros commented on issue #2591: [GOBBLIN-724] Upgrade throttling server so waiting until tokens can be used is done… URL: https://github.com/apache/incubator-gobblin/pull/2591#issuecomment-480102141 @htran1 can you review? I can go over the changes with you if necessary. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223370) Time Spent: 20m (was: 10m) > Throttling server delays responses for throttling causing too many connections > -- > > Key: GOBBLIN-724 > URL: https://issues.apache.org/jira/browse/GOBBLIN-724 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Issac Buenrostro >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Currently, the throttling server implements throttling in part by delaying > the response with the permit allocation. However, when waiting to respond, > the request remains in flight utilizing system resources and severely > limiting how many clients can use the throttling server. > As a fix, the server should respond immediately and ask the client to wait > before distributing the permits. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-724) Throttling server delays responses for throttling causing too many connections
[ https://issues.apache.org/jira/browse/GOBBLIN-724?focusedWorklogId=223366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223366 ] ASF GitHub Bot logged work on GOBBLIN-724: -- Author: ASF GitHub Bot Created on: 04/Apr/19 23:47 Start Date: 04/Apr/19 23:47 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2591: [GOBBLIN-724] Upgrade throttling server so waiting until tokens can be used is done… URL: https://github.com/apache/incubator-gobblin/pull/2591 … by the client instead of the server. See GOBBLIN-724. Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-XXX ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223366) Time Spent: 10m Remaining Estimate: 0h > Throttling server delays responses for throttling causing too many connections > -- > > Key: GOBBLIN-724 > URL: https://issues.apache.org/jira/browse/GOBBLIN-724 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Issac Buenrostro >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently, the throttling server implements throttling in part by delaying > the response with the permit allocation. However, when waiting to respond, > the request remains in flight utilizing system resources and severely > limiting how many clients can use the throttling server. > As a fix, the server should respond immediately and ask the client to wait > before distributing the permits. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (GOBBLIN-724) Throttling server delays responses for throttling causing too many connections
Issac Buenrostro created GOBBLIN-724: Summary: Throttling server delays responses for throttling causing too many connections Key: GOBBLIN-724 URL: https://issues.apache.org/jira/browse/GOBBLIN-724 Project: Apache Gobblin Issue Type: Bug Reporter: Issac Buenrostro Currently, the throttling server implements throttling in part by delaying the response with the permit allocation. However, when waiting to respond, the request remains in flight utilizing system resources and severely limiting how many clients can use the throttling server. As a fix, the server should respond immediately and ask the client to wait before distributing the permits. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
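The fix proposed in GOBBLIN-724 (reply immediately and let the client do the waiting) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation in PR #2591: `ThrottlingSketch`, `Allocation`, `Server`, `request`, and `awaitAllocation` are hypothetical names, and the simple rate arithmetic is an assumed stand-in for whatever policy the real throttling server applies.

```java
/** Hypothetical sketch of "respond immediately, let the client wait" throttling. */
final class ThrottlingSketch {

  /** Reply sent without blocking: the permits granted and how long the client should wait before using them. */
  static final class Allocation {
    final long permits;
    final long waitMillis;

    Allocation(long permits, long waitMillis) {
      this.permits = permits;
      this.waitMillis = waitMillis;
    }
  }

  /** Rate-based allocator; the clock value is passed in so the behaviour is deterministic. */
  static final class Server {
    private final double permitsPerMilli;
    private long nextFreeMillis; // earliest time at which newly granted permits may be used

    Server(long permitsPerSecond, long nowMillis) {
      this.permitsPerMilli = permitsPerSecond / 1000.0;
      this.nextFreeMillis = nowMillis;
    }

    /** Never blocks: reserves the permits and reports the delay instead of holding the request open. */
    synchronized Allocation request(long permits, long nowMillis) {
      long usableAt = Math.max(nextFreeMillis, nowMillis);
      nextFreeMillis = usableAt + (long) Math.ceil(permits / permitsPerMilli);
      return new Allocation(permits, usableAt - nowMillis);
    }
  }

  /** Client side: the waiting happens here, after the server has already answered. */
  static void awaitAllocation(Allocation allocation) throws InterruptedException {
    if (allocation.waitMillis > 0) {
      Thread.sleep(allocation.waitMillis);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Server server = new Server(1000, 0L);
    Allocation first = server.request(500, 0L);
    Allocation second = server.request(500, 0L);
    System.out.println(first.waitMillis + " " + second.waitMillis);
    awaitAllocation(second); // the client sleeps; no server request stays in flight
  }
}
```

Because `request` returns at once, no request stays in flight while the permits "mature", which is the resource problem the issue describes.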
[GitHub] [incubator-gobblin] ibuenros commented on issue #2591: [GOBBLIN-724] Upgrade throttling server so waiting until tokens can be used is done…
ibuenros commented on issue #2591: [GOBBLIN-724] Upgrade throttling server so waiting until tokens can be used is done… URL: https://github.com/apache/incubator-gobblin/pull/2591#issuecomment-480102141 @htran1 can you review? I can go over the changes with you if necessary.
[GitHub] [incubator-gobblin] ibuenros opened a new pull request #2591: [GOBBLIN-724] Upgrade throttling server so waiting until tokens can be used is done…
ibuenros opened a new pull request #2591: [GOBBLIN-724] Upgrade throttling server so waiting until tokens can be used is done… URL: https://github.com/apache/incubator-gobblin/pull/2591 … by the client instead of the server. See GOBBLIN-724. Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-XXX ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-720) delete the state store whenever a flow is deleted
[ https://issues.apache.org/jira/browse/GOBBLIN-720?focusedWorklogId=223232=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223232 ] ASF GitHub Bot logged work on GOBBLIN-720: -- Author: ASF GitHub Bot Created on: 04/Apr/19 20:24 Start Date: 04/Apr/19 20:24 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2587: [GOBBLIN-720 Always delete state store URL: https://github.com/apache/incubator-gobblin/pull/2587 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223232) Time Spent: 50m (was: 40m) > delete the state store whenever a flow is deleted > - > > Key: GOBBLIN-720 > URL: https://issues.apache.org/jira/browse/GOBBLIN-720 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Arjun Singh Bora >Priority: Major > Fix For: 0.15.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-722) add option to unschedule a gaas flow
[ https://issues.apache.org/jira/browse/GOBBLIN-722?focusedWorklogId=223234=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223234 ] ASF GitHub Bot logged work on GOBBLIN-722: -- Author: ASF GitHub Bot Created on: 04/Apr/19 20:25 Start Date: 04/Apr/19 20:25 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2589: [GOBBLIN-722] Unschedule gaas flow URL: https://github.com/apache/incubator-gobblin/pull/2589 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223234) Time Spent: 40m (was: 0.5h) > add option to unschedule a gaas flow > > > Key: GOBBLIN-722 > URL: https://issues.apache.org/jira/browse/GOBBLIN-722 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Arjun Singh Bora >Priority: Major > Fix For: 0.15.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GOBBLIN-722) add option to unschedule a gaas flow
[ https://issues.apache.org/jira/browse/GOBBLIN-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hung Tran resolved GOBBLIN-722. --- Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request #2589 [https://github.com/apache/incubator-gobblin/pull/2589] > add option to unschedule a gaas flow > > > Key: GOBBLIN-722 > URL: https://issues.apache.org/jira/browse/GOBBLIN-722 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Arjun Singh Bora >Priority: Major > Fix For: 0.15.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [incubator-gobblin] asfgit closed pull request #2589: [GOBBLIN-722] Unschedule gaas flow
asfgit closed pull request #2589: [GOBBLIN-722] Unschedule gaas flow URL: https://github.com/apache/incubator-gobblin/pull/2589
[jira] [Work logged] (GOBBLIN-723) Add support to the LogCopier for copying from multiple source paths
[ https://issues.apache.org/jira/browse/GOBBLIN-723?focusedWorklogId=223229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223229 ] ASF GitHub Bot logged work on GOBBLIN-723: -- Author: ASF GitHub Bot Created on: 04/Apr/19 20:22 Start Date: 04/Apr/19 20:22 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2590: [GOBBLIN-723] Add support to the LogCopier for copying from multiple … URL: https://github.com/apache/incubator-gobblin/pull/2590 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223229) Time Spent: 20m (was: 10m) > Add support to the LogCopier for copying from multiple source paths > --- > > Key: GOBBLIN-723 > URL: https://issues.apache.org/jira/browse/GOBBLIN-723 > Project: Apache Gobblin > Issue Type: Task >Reporter: Hung Tran >Assignee: Hung Tran >Priority: Major > Fix For: 0.15.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The LogCopier should support multiple source paths. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GOBBLIN-723) Add support to the LogCopier for copying from multiple source paths
[ https://issues.apache.org/jira/browse/GOBBLIN-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hung Tran resolved GOBBLIN-723. --- Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request #2590 [https://github.com/apache/incubator-gobblin/pull/2590] > Add support to the LogCopier for copying from multiple source paths > --- > > Key: GOBBLIN-723 > URL: https://issues.apache.org/jira/browse/GOBBLIN-723 > Project: Apache Gobblin > Issue Type: Task >Reporter: Hung Tran >Assignee: Hung Tran >Priority: Major > Fix For: 0.15.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The LogCopier should support multiple source paths. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (GOBBLIN-723) Add support to the LogCopier for copying from multiple source paths
Hung Tran created GOBBLIN-723: - Summary: Add support to the LogCopier for copying from multiple source paths Key: GOBBLIN-723 URL: https://issues.apache.org/jira/browse/GOBBLIN-723 Project: Apache Gobblin Issue Type: Task Reporter: Hung Tran Assignee: Hung Tran The LogCopier should support multiple source paths. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-723) Add support to the LogCopier for copying from multiple source paths
[ https://issues.apache.org/jira/browse/GOBBLIN-723?focusedWorklogId=223166=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223166 ] ASF GitHub Bot logged work on GOBBLIN-723: -- Author: ASF GitHub Bot Created on: 04/Apr/19 17:52 Start Date: 04/Apr/19 17:52 Worklog Time Spent: 10m Work Description: htran1 commented on pull request #2590: [GOBBLIN-723] Add support to the LogCopier for copying from multiple … URL: https://github.com/apache/incubator-gobblin/pull/2590 …source paths Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [X] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-723 ### Description - [X] Here are some details about my PR, including screenshots (if applicable): Add support for multiple source paths. The GobblinYarnLogSource will split the string value of LOG_DIRS and configure the LogCopier to look at multiple paths. ### Tests - [X] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: Tested with a job on an environment with multiple log directories. ### Commits - [X] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223166) Time Spent: 10m Remaining Estimate: 0h > Add support to the LogCopier for copying from multiple source paths > --- > > Key: GOBBLIN-723 > URL: https://issues.apache.org/jira/browse/GOBBLIN-723 > Project: Apache Gobblin > Issue Type: Task >Reporter: Hung Tran >Assignee: Hung Tran >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The LogCopier should support multiple source paths. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
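The PR description above says the `LOG_DIRS` string value is split into multiple paths. A minimal sketch of such a split is shown below; it assumes the value is comma-separated, which the message does not state, and `LogDirsSketch` and `splitLogDirs` are hypothetical names rather than the code in PR #2590.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: turn a LOG_DIRS-style string into a list of source paths. */
final class LogDirsSketch {

  /**
   * Splits on commas (an assumed separator), trims whitespace, and drops empty entries.
   * A null input yields an empty list.
   */
  static List<String> splitLogDirs(String logDirs) {
    List<String> paths = new ArrayList<>();
    if (logDirs == null) {
      return paths;
    }
    for (String part : logDirs.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        paths.add(trimmed);
      }
    }
    return paths;
  }

  public static void main(String[] args) {
    // Example paths are made up for illustration.
    System.out.println(splitLogDirs("/data1/logs, /data2/logs"));
  }
}
```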
[jira] [Work logged] (GOBBLIN-708) Create SqlDatasetDescriptor for JDBC-sourced datasets
[ https://issues.apache.org/jira/browse/GOBBLIN-708?focusedWorklogId=223082=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223082 ] ASF GitHub Bot logged work on GOBBLIN-708: -- Author: ASF GitHub Bot Created on: 04/Apr/19 16:23 Start Date: 04/Apr/19 16:23 Worklog Time Spent: 10m Work Description: zxcware commented on pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets. URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272259273 ## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.gobblin.service.modules.dataset; + +import java.io.IOException; + +import com.google.common.base.Preconditions; +import com.google.common.collect.ImmutableMap; +import com.typesafe.config.Config; +import com.typesafe.config.ConfigFactory; + +import lombok.EqualsAndHashCode; +import lombok.Getter; +import lombok.ToString; + +import org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys; +import org.apache.gobblin.util.ConfigUtils; + +@EqualsAndHashCode (exclude = {"description", "rawConfig"}) +@ToString (exclude = {"description", "rawConfig"}) +public abstract class BaseDatasetDescriptor implements DatasetDescriptor { + @Getter + private final String platform; + @Getter + private final FormatConfig formatConfig; + @Getter + private final boolean isRetentionApplied; + @Getter + private final String description; + @Getter + private final Config rawConfig; + + private static final Config DEFAULT_FALLBACK = + ConfigFactory.parseMap(ImmutableMap.builder() + .put(DatasetDescriptorConfigKeys.PATH_KEY, DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY) + .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false) + .build()); + + public BaseDatasetDescriptor(Config config) throws IOException { + Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY), "Dataset descriptor config must specify platform"); +this.platform = config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase(); +this.formatConfig = new FormatConfig(config); +this.isRetentionApplied = ConfigUtils.getBoolean(config, DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false); +this.description = ConfigUtils.getString(config, DatasetDescriptorConfigKeys.DESCRIPTION_KEY, ""); +this.rawConfig = config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK); + } + + /** + * {@inheritDoc} + */ + protected abstract boolean isPathContaining(String otherPath); + + /** + * @return true if this 
{@link DatasetDescriptor} contains the other {@link DatasetDescriptor} i.e. the + * datasets described by this {@link DatasetDescriptor} is a subset of the datasets described by the other + * {@link DatasetDescriptor}. This operation is non-commutative. + * @param other + */ + @Override + public boolean contains(DatasetDescriptor other) { +if (this == other) { + return true; +} +if (!getClass().equals(other.getClass())) { Review comment: Maybe `other == null || !getClass().equals(other.getClass())`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223082) Time Spent: 50m (was: 40m) > Create SqlDatasetDescriptor for JDBC-sourced datasets > -- > > Key: GOBBLIN-708 > URL: https://issues.apache.org/jira/browse/GOBBLIN-708 > Project: Apache Gobblin > Issue Type: Improvement > Components: gobblin-service >Affects
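The review suggestion above guards against a `null` argument before `other.getClass()` is called. A self-contained sketch of that guard follows. `ContainsSketch`, `Descriptor`, `FsDescriptor`, and `SqlDescriptor` are hypothetical stand-ins, and the exact-match path rule is a placeholder rather than Gobblin's actual containment logic.

```java
/** Hypothetical sketch of a null-safe, same-class check before a containment test. */
final class ContainsSketch {

  abstract static class Descriptor {
    final String path;

    Descriptor(String path) {
      this.path = path;
    }

    final boolean contains(Descriptor other) {
      if (this == other) {
        return true;
      }
      // Guard from the review comment: test for null before dereferencing other.
      if (other == null || !getClass().equals(other.getClass())) {
        return false;
      }
      return isPathContaining(other.path);
    }

    protected abstract boolean isPathContaining(String otherPath);
  }

  static final class FsDescriptor extends Descriptor {
    FsDescriptor(String path) {
      super(path);
    }

    @Override
    protected boolean isPathContaining(String otherPath) {
      return path.equals(otherPath); // placeholder rule: exact match only
    }
  }

  static final class SqlDescriptor extends Descriptor {
    SqlDescriptor(String path) {
      super(path);
    }

    @Override
    protected boolean isPathContaining(String otherPath) {
      return path.equals(otherPath); // placeholder rule: exact match only
    }
  }

  public static void main(String[] args) {
    Descriptor fs = new FsDescriptor("/data/a");
    System.out.println(fs.contains(null));
    System.out.println(fs.contains(new SqlDescriptor("/data/a")));
    System.out.println(fs.contains(new FsDescriptor("/data/a")));
  }
}
```

Without the `other == null` test, the call `other.getClass()` would throw a `NullPointerException` for a null argument instead of returning `false`.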
Re: Gobblin at ApacheCon ?
+1

Last year we received good traction at ApacheCon NA. Totally worth it.

@Tamas, if Vegas doesn't work, you should definitely go for ApacheCon EU: https://aceu19.apachecon.com/

Abhishek

On Thu, Apr 4, 2019 at 9:16 AM Jay Sen wrote:
> I also think it would be really helpful to the project, especially when the
> core dev team is working hard and trying to make this a top-level Apache
> project.
>
> -Jay
>
> On Wed, Apr 3, 2019 at 11:34 PM Tamas Nemeth wrote:
> > I love the idea as well!
> > I would be happy to see it at ApacheCon, but Las Vegas is a bit far from here. :(
> >
> > Tamas
> >
> > On 2019. Apr 4., Thu at 8:24, Jean-Baptiste Onofré wrote:
> > > It sounds good to me.
> > >
> > > Regards
> > > JB
> > >
> > > On 03/04/2019 18:59, Jay Sen wrote:
> > > > Hi Guys,
> > > >
> > > > Let's present Apache Gobblin at ApacheCon.
> > > >
> > > > I would be interested in presenting/co-presenting PayPal's use case.
> > > >
> > > > @PMCs, please share your thoughts.
> > > >
> > > > Thanks
> > > > Jay
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
[jira] [Work logged] (GOBBLIN-708) Create SqlDatasetDescriptor for JDBC-sourced datasets
[ https://issues.apache.org/jira/browse/GOBBLIN-708?focusedWorklogId=223079=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223079 ]

ASF GitHub Bot logged work on GOBBLIN-708:
--
                Author: ASF GitHub Bot
            Created on: 04/Apr/19 16:23
            Start Date: 04/Apr/19 16:23
    Worklog Time Spent: 10m
      Work Description: zxcware commented on pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272257572

 ## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+      ConfigFactory.parseMap(ImmutableMap.builder()
+          .put(DatasetDescriptorConfigKeys.PATH_KEY, DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+          .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+          .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+    Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY), "Dataset descriptor config must specify platform");
+    this.platform = config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+    this.formatConfig = new FormatConfig(config);
+    this.isRetentionApplied = ConfigUtils.getBoolean(config, DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+    this.description = ConfigUtils.getString(config, DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+    this.rawConfig = config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);

 Review comment:
   Should we favor `formatConfig` over the input `config`?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
    Worklog Id:     (was: 223079)
    Time Spent: 20m  (was: 10m)

> Create SqlDatasetDescriptor for JDBC-sourced datasets
> --
>
>                 Key: GOBBLIN-708
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-708
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-service
>    Affects Versions: 0.15.0
>            Reporter: Sudarshan Vasudevan
>            Assignee: Abhishek Tiwari
>            Priority: Major
>             Fix For: 0.15.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Create a new DatasetDescriptor for JDBC sourced datasets.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
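For context on the review question above: in Typesafe Config, `a.withFallback(b)` keeps the values of `a` and consults `b` only for keys that `a` lacks, so the constructor as written lets the input `config` win over `formatConfig.getRawConfig()`, which in turn wins over `DEFAULT_FALLBACK`. The sketch below models only that precedence rule with plain `java.util` maps. The `FallbackOrder` class, its `withFallback` helper, and the flat keys `format`, `path`, and `isRetentionApplied` are hypothetical stand-ins for illustration, not Gobblin or Typesafe Config APIs.

```java
import java.util.HashMap;
import java.util.Map;

public class FallbackOrder {
    // Hypothetical helper modelling receiver-wins merging on flat keys:
    // entries in primary are kept, fallback only fills keys primary lacks.
    static Map<String, Object> withFallback(Map<String, Object> primary, Map<String, Object> fallback) {
        Map<String, Object> merged = new HashMap<>(fallback);
        merged.putAll(primary);
        return merged;
    }

    // Mirrors the shape of config.withFallback(formatRaw).withFallback(DEFAULT_FALLBACK).
    static Map<String, Object> resolve() {
        Map<String, Object> input = new HashMap<>();
        input.put("format", "avro");            // assumed value supplied by the caller

        Map<String, Object> formatRaw = new HashMap<>();
        formatRaw.put("format", "any");         // assumed value carried by the format config

        Map<String, Object> defaults = new HashMap<>();
        defaults.put("path", "any");
        defaults.put("isRetentionApplied", false);

        return withFallback(withFallback(input, formatRaw), defaults);
    }

    public static void main(String[] args) {
        Map<String, Object> raw = resolve();
        System.out.println("format = " + raw.get("format"));
        System.out.println("path = " + raw.get("path"));
    }
}
```

In this model, favoring the format config instead would mean swapping the first two maps, so that `formatRaw` is the receiver and `input` only fills the gaps.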
[jira] [Work logged] (GOBBLIN-708) Create SqlDatasetDescriptor for JDBC-sourced datasets
[ https://issues.apache.org/jira/browse/GOBBLIN-708?focusedWorklogId=223080=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223080 ]

ASF GitHub Bot logged work on GOBBLIN-708:
--
                Author: ASF GitHub Bot
            Created on: 04/Apr/19 16:23
            Start Date: 04/Apr/19 16:23
    Worklog Time Spent: 10m
      Work Description: zxcware commented on pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272260089

 ## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/EncryptionConfig.java
 ##
 @@ -20,18 +20,21 @@
 import java.io.IOException;

 import com.google.common.base.Enums;
-import com.google.common.base.Joiner;
 import com.google.common.collect.ImmutableMap;
 import com.typesafe.config.Config;
 import com.typesafe.config.ConfigFactory;

+import lombok.EqualsAndHashCode;
 import lombok.Getter;
+import lombok.ToString;
 import lombok.extern.slf4j.Slf4j;

 import org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
 import org.apache.gobblin.util.ConfigUtils;

 @Slf4j
+@ToString(exclude = {"rawConfig"})

 Review comment:
   nice

Issue Time Tracking
---
    Worklog Id:     (was: 223080)
    Time Spent: 0.5h  (was: 20m)

> Create SqlDatasetDescriptor for JDBC-sourced datasets
> --
>
>                 Key: GOBBLIN-708
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-708
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-service
>    Affects Versions: 0.15.0
>            Reporter: Sudarshan Vasudevan
>            Assignee: Abhishek Tiwari
>            Priority: Major
>             Fix For: 0.15.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Create a new DatasetDescriptor for JDBC sourced datasets.
[jira] [Work logged] (GOBBLIN-708) Create SqlDatasetDescriptor for JDBC-sourced datasets
[ https://issues.apache.org/jira/browse/GOBBLIN-708?focusedWorklogId=223081=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223081 ]

ASF GitHub Bot logged work on GOBBLIN-708:
--
                Author: ASF GitHub Bot
            Created on: 04/Apr/19 16:23
            Start Date: 04/Apr/19 16:23
    Worklog Time Spent: 10m
      Work Description: zxcware commented on pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272259071

 ## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+      ConfigFactory.parseMap(ImmutableMap.builder()
+          .put(DatasetDescriptorConfigKeys.PATH_KEY, DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+          .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+          .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+    Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY), "Dataset descriptor config must specify platform");
+    this.platform = config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+    this.formatConfig = new FormatConfig(config);
+    this.isRetentionApplied = ConfigUtils.getBoolean(config, DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+    this.description = ConfigUtils.getString(config, DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+    this.rawConfig = config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  protected abstract boolean isPathContaining(String otherPath);
+
+  /**
+   * @return true if this {@link DatasetDescriptor} contains the other {@link DatasetDescriptor} i.e. the
+   * datasets described by this {@link DatasetDescriptor} is a subset of the datasets described by the other
+   * {@link DatasetDescriptor}. This operation is non-commutative.
+   * @param other
+   */
+  @Override
+  public boolean contains(DatasetDescriptor other) {
+    if (this == other) {
+      return true;
+    }
+    if (!getClass().equals(other.getClass())) {
+      return false;
+    }
+
+    if (this.getPlatform() == null || other.getPlatform() == null || !this.getPlatform().equalsIgnoreCase(other.getPlatform())) {

 Review comment:
   we can simplify as `if (this.getPlatform() == null || !this.getPlatform().equalsIgnoreCase(other.getPlatform())) {`

Issue Time Tracking
---
    Worklog Id:     (was: 223081)
    Time Spent: 40m  (was: 0.5h)

> Create SqlDatasetDescriptor for JDBC-sourced datasets
> --
>
>                 Key:
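The simplification suggested in the review comment works because `String.equalsIgnoreCase` returns `false` when its argument is `null`, so only the receiver needs an explicit null check. The following is a minimal sketch of that equivalence; the `PlatformCheck` class and its two method names are hypothetical and only mirror the shape of the condition in the diff.

```java
public class PlatformCheck {
    // Condition as written in the diff: both sides are null-checked explicitly.
    static boolean mismatchVerbose(String mine, String theirs) {
        return mine == null || theirs == null || !mine.equalsIgnoreCase(theirs);
    }

    // Condition as proposed in the review: equalsIgnoreCase(null) is false,
    // so a null argument already counts as a mismatch.
    static boolean mismatchSimplified(String mine, String theirs) {
        return mine == null || !mine.equalsIgnoreCase(theirs);
    }

    public static void main(String[] args) {
        String[] values = {null, "hdfs", "HDFS", "mysql"};
        // Compare the two forms on every pair, including nulls on either side.
        for (String a : values) {
            for (String b : values) {
                if (mismatchVerbose(a, b) != mismatchSimplified(a, b)) {
                    throw new AssertionError("forms disagree for " + a + " vs " + b);
                }
            }
        }
        System.out.println("both forms agree on all pairs");
    }
}
```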
[GitHub] [incubator-gobblin] zxcware commented on a change in pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.
zxcware commented on a change in pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272259273

 ## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+      ConfigFactory.parseMap(ImmutableMap.builder()
+          .put(DatasetDescriptorConfigKeys.PATH_KEY, DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+          .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+          .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+    Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY), "Dataset descriptor config must specify platform");
+    this.platform = config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+    this.formatConfig = new FormatConfig(config);
+    this.isRetentionApplied = ConfigUtils.getBoolean(config, DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+    this.description = ConfigUtils.getString(config, DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+    this.rawConfig = config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  protected abstract boolean isPathContaining(String otherPath);
+
+  /**
+   * @return true if this {@link DatasetDescriptor} contains the other {@link DatasetDescriptor} i.e. the
+   * datasets described by this {@link DatasetDescriptor} is a subset of the datasets described by the other
+   * {@link DatasetDescriptor}. This operation is non-commutative.
+   * @param other
+   */
+  @Override
+  public boolean contains(DatasetDescriptor other) {
+    if (this == other) {
+      return true;
+    }
+    if (!getClass().equals(other.getClass())) {

 Review comment:
   Maybe `other == null || !getClass().equals(other.getClass())`?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
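The guard proposed in the review comment matters because `other.getClass()` throws a `NullPointerException` when `other` is `null`, and the preceding `this == other` check does not catch that case. Below is a minimal sketch of the early checks with the guard in place; `ContainsGuard`, `Descriptor`, and `SqlDescriptor` are hypothetical stand-ins, and the platform, path, and format comparisons from the real method are deliberately left out.

```java
public class ContainsGuard {
    // Hypothetical stand-in for a descriptor base class.
    static class Descriptor {
        boolean contains(Descriptor other) {
            if (this == other) {
                return true;
            }
            // Null guard first: without it, other.getClass() would throw for a null argument.
            if (other == null || !getClass().equals(other.getClass())) {
                return false;
            }
            return true; // remaining field comparisons omitted in this sketch
        }
    }

    // Hypothetical subclass, used only to show the class-equality check rejecting it.
    static class SqlDescriptor extends Descriptor {
    }

    public static void main(String[] args) {
        Descriptor base = new Descriptor();
        System.out.println(base.contains(null));                // false, no exception
        System.out.println(base.contains(new SqlDescriptor())); // false, different class
        System.out.println(base.contains(new Descriptor()));    // true in this sketch
    }
}
```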
Re: Gobblin at ApacheCon ?
I love the idea as well! I would be happy to see it at ApacheCon, but Las Vegas is a bit far from here. :(

Tamas

On 2019. Apr 4., Thu at 8:24, Jean-Baptiste Onofré wrote:

> It sounds good to me.
>
> Regards
> JB
>
> On 03/04/2019 18:59, Jay Sen wrote:
> > Hi Guys,
> >
> > Lets present Apache Gobblin at the ApacheCon.
> >
> > I would be interested in presenting/co-presenting PayPal's use-case.
> >
> > @PMCs, Please share your thoughts.
> >
> > Thanks
> > Jay
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
Re: Gobblin at ApacheCon ?
It sounds good to me.

Regards
JB

On 03/04/2019 18:59, Jay Sen wrote:
> Hi Guys,
>
> Lets present Apache Gobblin at the ApacheCon.
>
> I would be interested in presenting/co-presenting PayPal's use-case.
>
> @PMCs, Please share your thoughts.
>
> Thanks
> Jay
>

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222822 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
                Author: ASF GitHub Bot
            Created on: 04/Apr/19 05:58
            Start Date: 04/Apr/19 05:58
    Worklog Time Spent: 10m
      Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272026110

 ## File path: conf/yarn/application.conf
 ##
 @@ -22,15 +22,18 @@ gobblin.yarn.app.name=GobblinYarn
 gobblin.yarn.app.master.memory.mbs=256
 gobblin.yarn.initial.containers=2
 gobblin.yarn.container.memory.mbs=512
-gobblin.yarn.conf.dir=
-gobblin.yarn.lib.jars.dir=
-gobblin.yarn.app.master.files.local=${gobblin.yarn.conf.dir}"/log4j-yarn.properties,"${gobblin.yarn.conf.dir}"/application.conf,"${gobblin.yarn.conf.dir}"/reference.conf"
+gobblin.yarn.conf.dir=/tools/gobblin-dist/conf/yarn/

 Review comment:
   I missed this. Let me change it to `gobblin.yarn.conf.dir=${GOBBLIN_HOME}/conf/yarn/`, which will be better than having the hard-coded path.
   By the way, thanks for catching this; it was my local config.

Issue Time Tracking
---
    Worklog Id:     (was: 222822)
    Time Spent: 2h 10m  (was: 2h)

> combine & standardize all gobblin scripts into one master script &
> restructure configs accordingly
> --
>
>                 Key: GOBBLIN-707
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-707
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Jay Sen
>            Priority: Major
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone,
> cluster-master, cluster-worker, AWS, YARN, MR) and there is an individual
> script for each of them.
> 1. There can be one gobblin.sh script:
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean,
> historystore-manager}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr,
> service}}
> 2. Also, configs need to be structured and deduped accordingly.
>