[jira] [Created] (GOBBLIN-313) Option to explicitly set group name for staging and final destination directories for Avro-To-Orc conversion
Aditya Sharma created GOBBLIN-313:
-------------------------------------

Summary: Option to explicitly set group name for staging and final destination directories for Avro-To-Orc conversion
Key: GOBBLIN-313
URL: https://issues.apache.org/jira/browse/GOBBLIN-313
Project: Apache Gobblin
Issue Type: Improvement
Reporter: Aditya Sharma
Assignee: Aditya Sharma

Currently, the Avro-To-Orc conversion job tries to preserve the group name during conversion. That is, the group name of the destination directory will be the same as that of the source directory.
There should be an option to explicitly define the group name for the top-level destination directory and its immediate child directories (the staging and final directories).

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
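The requested behavior amounts to a simple precedence rule: use an explicitly configured group when one is given, otherwise keep preserving the source directory's group. A minimal sketch of that rule follows. The property name `example.conversion.destination.group.name` and the class and method names are hypothetical, chosen only for illustration; they are not actual Gobblin configuration keys or APIs.

```java
import java.util.Optional;
import java.util.Properties;

public class DestinationGroupSketch {
    // Hypothetical property name, NOT an actual Gobblin configuration key.
    public static final String GROUP_KEY = "example.conversion.destination.group.name";

    /**
     * Returns the explicitly configured group name if it is present and non-blank;
     * otherwise falls back to the source directory's group (the current behavior).
     */
    public static String resolveGroup(Properties props, String sourceGroup) {
        return Optional.ofNullable(props.getProperty(GROUP_KEY))
                .map(String::trim)
                .filter(group -> !group.isEmpty())
                .orElse(sourceGroup);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // No explicit group configured: the source group is preserved.
        System.out.println(resolveGroup(props, "srcgroup")); // prints "srcgroup"

        // Explicit group configured: it overrides the source group.
        props.setProperty(GROUP_KEY, "etl");
        System.out.println(resolveGroup(props, "srcgroup")); // prints "etl"
    }
}
```

The resolved name would then be applied to the top-level destination directory and to the staging and final directories; how that is done depends on the file system API in use and is left out of this sketch.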
[jira] [Resolved] (GOBBLIN-312) Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource
[ https://issues.apache.org/jira/browse/GOBBLIN-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hung Tran resolved GOBBLIN-312.
-------------------------------
Resolution: Fixed

Issue resolved by pull request #2166
[https://github.com/apache/incubator-gobblin/pull/2166]

> Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource
> ---------------------------------------------------------------------------------
>
> Key: GOBBLIN-312
> URL: https://issues.apache.org/jira/browse/GOBBLIN-312
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Hung Tran
> Assignee: Hung Tran
>
> Pass extra configuration to the KafkaConsumer. One use case is SSL configuration.
[jira] [Updated] (GOBBLIN-312) Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource
[ https://issues.apache.org/jira/browse/GOBBLIN-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hung Tran updated GOBBLIN-312:
------------------------------
Sprint: Apache Gobblin 170905

> Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource
> ---------------------------------------------------------------------------------
>
> Key: GOBBLIN-312
> URL: https://issues.apache.org/jira/browse/GOBBLIN-312
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Hung Tran
> Assignee: Hung Tran
>
> Pass extra configuration to the KafkaConsumer. One use case is SSL configuration.
[jira] [Created] (GOBBLIN-312) Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource
Hung Tran created GOBBLIN-312:
---------------------------------

Summary: Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource
Key: GOBBLIN-312
URL: https://issues.apache.org/jira/browse/GOBBLIN-312
Project: Apache Gobblin
Issue Type: Task
Reporter: Hung Tran

Pass extra configuration to the KafkaConsumer. One use case is SSL configuration.
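One common way to pass extra configuration through to a client is to reserve a key prefix in the job configuration and forward every key under that prefix, with the prefix stripped. The sketch below illustrates that idea only; it is not the change made for this issue. The prefix `example.kafka.consumer.` and the class and method names are hypothetical. The keys `security.protocol` and `ssl.truststore.location` are standard Kafka client settings, and the truststore path is a placeholder.

```java
import java.util.Properties;

public class KafkaExtraConfigSketch {
    // Hypothetical prefix, NOT an actual Gobblin configuration key.
    public static final String PREFIX = "example.kafka.consumer.";

    /**
     * Copies every property whose name starts with PREFIX into a new Properties
     * object, with the prefix removed. The result could be merged into the
     * properties used to construct a KafkaConsumer.
     */
    public static Properties extractConsumerOverrides(Properties jobProps) {
        Properties overrides = new Properties();
        for (String name : jobProps.stringPropertyNames()) {
            // Skip keys outside the prefix and the bare prefix itself.
            if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
                overrides.setProperty(name.substring(PREFIX.length()), jobProps.getProperty(name));
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        Properties jobProps = new Properties();
        jobProps.setProperty("job.name", "example-job"); // not forwarded
        jobProps.setProperty(PREFIX + "security.protocol", "SSL");
        jobProps.setProperty(PREFIX + "ssl.truststore.location", "/path/to/truststore.jks");

        Properties overrides = extractConsumerOverrides(jobProps);
        for (String name : overrides.stringPropertyNames()) {
            System.out.println(name + "=" + overrides.getProperty(name));
        }
    }
}
```

A prefix keeps the pass-through generic: new consumer settings, SSL or otherwise, need no code change in the source.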
[jira] [Assigned] (GOBBLIN-312) Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource
[ https://issues.apache.org/jira/browse/GOBBLIN-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hung Tran reassigned GOBBLIN-312:
---------------------------------
Assignee: Hung Tran

> Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource
> ---------------------------------------------------------------------------------
>
> Key: GOBBLIN-312
> URL: https://issues.apache.org/jira/browse/GOBBLIN-312
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Hung Tran
> Assignee: Hung Tran
>
> Pass extra configuration to the KafkaConsumer. One use case is SSL configuration.
[jira] [Created] (GOBBLIN-311) Gobblin AWS runs old jobs when cluster is restarted.
Joel Baranick created GOBBLIN-311:
-------------------------------------

Summary: Gobblin AWS runs old jobs when cluster is restarted.
Key: GOBBLIN-311
URL: https://issues.apache.org/jira/browse/GOBBLIN-311
Project: Apache Gobblin
Issue Type: Bug
Reporter: Joel Baranick
Assignee: Hung Tran

On startup of my cluster, old jobs are still attempted. @htran1 said that they should be cleaned up in Standalone mode, but that does not seem compatible with running under AWS:
[http://gobblin.readthedocs.io/en/latest/user-guide/Gobblin-Deployment/#standalone-architecture]

Also, if I enable Standalone mode, then {{GobblinClusterManager.sendShutdownRequest()}} won't be called.

Additionally, when Standalone mode is enabled, GobblinClusterManager will call the following code, which doesn't seem right if I'm running under AWS:
{code:java}
// In AWS / Yarn mode, the cluster Launcher takes care of setting up Helix cluster
/// .. but for Standalone mode, we go via this main() method, so setup the cluster here
if (isStandaloneClusterManager) {
  // Create Helix cluster and connect to it
  String zkConnectionString = config.getString(GobblinClusterConfigurationKeys.ZK_CONNECTION_STRING_KEY);
  String helixClusterName = config.getString(GobblinClusterConfigurationKeys.HELIX_CLUSTER_NAME_KEY);
  HelixUtils.createGobblinHelixCluster(zkConnectionString, helixClusterName, false);
  LOGGER.info("Created Helix cluster " + helixClusterName);
}
{code}
Thoughts?
[jira] [Assigned] (GOBBLIN-310) Skip rerunning completed tasks on mapper reattempts
[ https://issues.apache.org/jira/browse/GOBBLIN-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hung Tran reassigned GOBBLIN-310:
---------------------------------
Assignee: Hung Tran

> Skip rerunning completed tasks on mapper reattempts
> ---------------------------------------------------
>
> Key: GOBBLIN-310
> URL: https://issues.apache.org/jira/browse/GOBBLIN-310
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Hung Tran
> Assignee: Hung Tran
>
> Subsequent executions of a failed mapper will rerun completed tasks. This can result in duplicate data or errors due to collisions when publishing.
> The state of completed mappers should be recorded and completed mappers should be skipped on subsequent attempts.
[jira] [Updated] (GOBBLIN-310) Skip rerunning completed tasks on mapper reattempts
[ https://issues.apache.org/jira/browse/GOBBLIN-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hung Tran updated GOBBLIN-310:
------------------------------
Sprint: Apache Gobblin 170905

> Skip rerunning completed tasks on mapper reattempts
> ---------------------------------------------------
>
> Key: GOBBLIN-310
> URL: https://issues.apache.org/jira/browse/GOBBLIN-310
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Hung Tran
> Assignee: Hung Tran
>
> Subsequent executions of a failed mapper will rerun completed tasks. This can result in duplicate data or errors due to collisions when publishing.
> The state of completed mappers should be recorded and completed mappers should be skipped on subsequent attempts.
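The idea described in GOBBLIN-310, recording completion state so that a reattempt skips work that already finished, can be sketched with one marker file per completed task. This is an illustration under stated assumptions, not the fix that was implemented: the class name, the `.done` marker convention, and the use of a local temporary directory are all hypothetical. A real job would need a location that survives across mapper attempts, such as a shared file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class CompletedTaskTrackerSketch {
    private final Path markerDir;

    public CompletedTaskTrackerSketch(Path markerDir) {
        this.markerDir = markerDir;
    }

    /** Convenience factory that stores markers in a fresh temporary directory. */
    public static CompletedTaskTrackerSketch inTempDir() {
        try {
            return new CompletedTaskTrackerSketch(Files.createTempDirectory("task-markers"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public boolean isCompleted(String taskId) {
        return Files.exists(markerDir.resolve(taskId + ".done"));
    }

    public void markCompleted(String taskId) {
        try {
            Files.createFile(markerDir.resolve(taskId + ".done"));
        } catch (FileAlreadyExistsException e) {
            // Completion was already recorded; nothing to do.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Runs one attempt over the given tasks. Tasks with a marker are skipped.
     * The attempt stops at the first task that reports failure, mimicking a
     * failed mapper. Returns the ids of the tasks executed in this attempt.
     */
    public List<String> runAttempt(List<String> taskIds, Predicate<String> task) {
        List<String> executed = new ArrayList<>();
        for (String id : taskIds) {
            if (isCompleted(id)) {
                continue; // finished in an earlier attempt
            }
            executed.add(id);
            if (!task.test(id)) {
                break; // the attempt fails here; no marker is written
            }
            markCompleted(id);
        }
        return executed;
    }

    public static void main(String[] args) {
        CompletedTaskTrackerSketch tracker = inTempDir();
        List<String> tasks = Arrays.asList("task-1", "task-2", "task-3");

        // First attempt: task-2 fails, so only task-1 is recorded as completed.
        System.out.println(tracker.runAttempt(tasks, id -> !id.equals("task-2"))); // [task-1, task-2]

        // Reattempt: task-1 is skipped; task-2 and task-3 run.
        System.out.println(tracker.runAttempt(tasks, id -> true)); // [task-2, task-3]
    }
}
```

Writing the marker only after a task succeeds keeps the scheme safe: a task interrupted partway through leaves no marker and is rerun on the next attempt.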