[jira] [Created] (GOBBLIN-313) Option to explicitly set group name for staging and final destination directories for Avro-To-Orc conversion

2017-11-09 Thread Aditya Sharma (JIRA)
Aditya Sharma created GOBBLIN-313:
-

 Summary: Option to explicitly set group name for staging and final 
destination directories for Avro-To-Orc conversion
 Key: GOBBLIN-313
 URL: https://issues.apache.org/jira/browse/GOBBLIN-313
 Project: Apache Gobblin
  Issue Type: Improvement
Reporter: Aditya Sharma
Assignee: Aditya Sharma


Currently, the Avro-To-Orc conversion job tries to preserve the group name during 
conversion. That is, the group name for the destination directory will be the 
same as that of the source directory.

There should be an option to explicitly define the group name for the top-level 
destination directory and its immediate child directories (staging/final directories).
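
A minimal sketch of how such an option might resolve the group name, assuming a hypothetical property key (`avro.to.orc.destination.group.name` is invented for illustration and is not a real Gobblin key). When the key is unset or blank, the sketch falls back to the source directory's group, which matches the current behavior described above.

```java
import java.util.Optional;
import java.util.Properties;

public class DestinationGroupSketch {
    // Hypothetical key; the issue does not name the actual configuration property.
    static final String GROUP_KEY = "avro.to.orc.destination.group.name";

    /**
     * Returns the explicitly configured group name if one is set and non-blank,
     * otherwise the group of the source directory (the existing behavior).
     */
    static String resolveGroup(Properties props, String sourceGroup) {
        return Optional.ofNullable(props.getProperty(GROUP_KEY))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .orElse(sourceGroup);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(resolveGroup(props, "sourcegroup"));
        props.setProperty(GROUP_KEY, "etlgroup");
        System.out.println(resolveGroup(props, "sourcegroup"));
    }
}
```

The resolved name would then be applied to the top-level destination directory and to the staging and final directories beneath it; how ownership is actually set is outside this sketch.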



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (GOBBLIN-312) Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource

2017-11-09 Thread Hung Tran (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran resolved GOBBLIN-312.
---
Resolution: Fixed

Issue resolved by pull request #2166
[https://github.com/apache/incubator-gobblin/pull/2166]

> Pass extra kafka configuration to the KafkaConsumer in 
> KafkaSimpleStreamingSource
> -
>
> Key: GOBBLIN-312
> URL: https://issues.apache.org/jira/browse/GOBBLIN-312
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Hung Tran
>Assignee: Hung Tran
>
> Pass extra configuration to the KafkaConsumer. One use case is SSL 
> configuration.





[jira] [Updated] (GOBBLIN-312) Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource

2017-11-09 Thread Hung Tran (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran updated GOBBLIN-312:
--
Sprint: Apache Gobblin 170905

> Pass extra kafka configuration to the KafkaConsumer in 
> KafkaSimpleStreamingSource
> -
>
> Key: GOBBLIN-312
> URL: https://issues.apache.org/jira/browse/GOBBLIN-312
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Hung Tran
>Assignee: Hung Tran
>
> Pass extra configuration to the KafkaConsumer. One use case is SSL 
> configuration.





[jira] [Created] (GOBBLIN-312) Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource

2017-11-09 Thread Hung Tran (JIRA)
Hung Tran created GOBBLIN-312:
-

 Summary: Pass extra kafka configuration to the KafkaConsumer in 
KafkaSimpleStreamingSource
 Key: GOBBLIN-312
 URL: https://issues.apache.org/jira/browse/GOBBLIN-312
 Project: Apache Gobblin
  Issue Type: Task
Reporter: Hung Tran


Pass extra configuration to the KafkaConsumer. One use case is SSL 
configuration.
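
One way this could work is sketched below: job properties under a dedicated prefix are copied, with the prefix stripped, into the properties handed to the consumer. The prefix `source.kafka.consumerConfig.` and the class name are assumptions made for illustration; the issue does not state the actual key. `security.protocol` and `ssl.truststore.location` are standard Kafka client settings, and the truststore path is a placeholder.

```java
import java.util.Properties;

public class KafkaConsumerConfigSketch {
    // Hypothetical prefix; not taken from the issue or from Gobblin's code.
    static final String PREFIX = "source.kafka.consumerConfig.";

    /** Copies every job property under PREFIX into a new Properties object, stripping the prefix. */
    static Properties extractConsumerProps(Properties jobProps) {
        Properties consumerProps = new Properties();
        for (String name : jobProps.stringPropertyNames()) {
            if (name.startsWith(PREFIX)) {
                consumerProps.setProperty(name.substring(PREFIX.length()), jobProps.getProperty(name));
            }
        }
        return consumerProps;
    }

    public static void main(String[] args) {
        Properties jobProps = new Properties();
        jobProps.setProperty("job.name", "kafka-streaming-example");
        jobProps.setProperty(PREFIX + "security.protocol", "SSL");
        jobProps.setProperty(PREFIX + "ssl.truststore.location", "/path/to/truststore.jks");

        // These properties would be merged into the ones used to construct the KafkaConsumer.
        Properties consumerProps = extractConsumerProps(jobProps);
        System.out.println(consumerProps.getProperty("security.protocol"));
    }
}
```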





[jira] [Assigned] (GOBBLIN-312) Pass extra kafka configuration to the KafkaConsumer in KafkaSimpleStreamingSource

2017-11-09 Thread Hung Tran (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran reassigned GOBBLIN-312:
-

Assignee: Hung Tran

> Pass extra kafka configuration to the KafkaConsumer in 
> KafkaSimpleStreamingSource
> -
>
> Key: GOBBLIN-312
> URL: https://issues.apache.org/jira/browse/GOBBLIN-312
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Hung Tran
>Assignee: Hung Tran
>
> Pass extra configuration to the KafkaConsumer. One use case is SSL 
> configuration.





[jira] [Created] (GOBBLIN-311) Gobblin AWS runs old jobs when cluster is restarted.

2017-11-09 Thread Joel Baranick (JIRA)
Joel Baranick created GOBBLIN-311:
-

 Summary: Gobblin AWS runs old jobs when cluster is restarted.
 Key: GOBBLIN-311
 URL: https://issues.apache.org/jira/browse/GOBBLIN-311
 Project: Apache Gobblin
  Issue Type: Bug
Reporter: Joel Baranick
Assignee: Hung Tran


On startup of my cluster, old jobs are still attempted. @htran1 said that they 
should be cleaned up in Standalone mode, but that does not seem compatible with 
running under AWS: 
[http://gobblin.readthedocs.io/en/latest/user-guide/Gobblin-Deployment/#standalone-architecture]
Also, if I enable Standalone mode, then 
{{GobblinClusterManager.sendShutdownRequest()}} won't be called. Additionally, 
with Standalone mode enabled, GobblinClusterManager will call the following 
code, which doesn't seem right if I'm running under AWS:

{code:java}
// In AWS / Yarn mode, the cluster Launcher takes care of setting up Helix cluster
/// .. but for Standalone mode, we go via this main() method, so setup the cluster here
if (isStandaloneClusterManager) {
  // Create Helix cluster and connect to it
  String zkConnectionString = config.getString(GobblinClusterConfigurationKeys.ZK_CONNECTION_STRING_KEY);
  String helixClusterName = config.getString(GobblinClusterConfigurationKeys.HELIX_CLUSTER_NAME_KEY);
  HelixUtils.createGobblinHelixCluster(zkConnectionString, helixClusterName, false);
  LOGGER.info("Created Helix cluster " + helixClusterName);
}
{code}

Thoughts?





[jira] [Assigned] (GOBBLIN-310) Skip rerunning completed tasks on mapper reattempts

2017-11-09 Thread Hung Tran (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran reassigned GOBBLIN-310:
-

Assignee: Hung Tran

> Skip rerunning completed tasks on mapper reattempts
> ---
>
> Key: GOBBLIN-310
> URL: https://issues.apache.org/jira/browse/GOBBLIN-310
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Hung Tran
>Assignee: Hung Tran
>
> Subsequent executions of a failed mapper will rerun completed tasks. This can 
> result in duplicate data or errors due to collisions when publishing.
> The state of completed mappers should be recorded and completed mappers 
> should be skipped on subsequent attempts.
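
The idea in the description can be sketched as follows, assuming completion state is kept somewhere that outlives a single attempt. Here an in-memory set stands in for that durable store, and all class and method names are hypothetical; this is not Gobblin's implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SkipCompletedTasksSketch {
    // Stand-in for durable state that survives a failed attempt (an assumption of this sketch).
    private final Set<String> completedTaskIds = new HashSet<>();

    /**
     * Runs every task that is not already recorded as complete and returns the IDs
     * run in this attempt. A task equal to failingTaskId simulates a failure that
     * ends the attempt early; pass null for an attempt with no failure.
     */
    List<String> runAttempt(List<String> taskIds, String failingTaskId) {
        List<String> ran = new ArrayList<>();
        for (String id : taskIds) {
            if (completedTaskIds.contains(id)) {
                continue; // completed in an earlier attempt, so skip it
            }
            if (id.equals(failingTaskId)) {
                break; // simulated failure: the attempt stops here
            }
            ran.add(id);
            completedTaskIds.add(id); // record completion before moving on
        }
        return ran;
    }

    public static void main(String[] args) {
        SkipCompletedTasksSketch mapper = new SkipCompletedTasksSketch();
        List<String> tasks = List.of("task-1", "task-2", "task-3");
        System.out.println(mapper.runAttempt(tasks, "task-2")); // first attempt fails on task-2
        System.out.println(mapper.runAttempt(tasks, null));     // reattempt skips task-1
    }
}
```

Because task-1 is recorded before the failure, the reattempt runs only task-2 and task-3, which avoids the duplicate data and publish collisions mentioned above.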





[jira] [Updated] (GOBBLIN-310) Skip rerunning completed tasks on mapper reattempts

2017-11-09 Thread Hung Tran (JIRA)

 [ 
https://issues.apache.org/jira/browse/GOBBLIN-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran updated GOBBLIN-310:
--
Sprint: Apache Gobblin 170905

> Skip rerunning completed tasks on mapper reattempts
> ---
>
> Key: GOBBLIN-310
> URL: https://issues.apache.org/jira/browse/GOBBLIN-310
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Hung Tran
>Assignee: Hung Tran
>
> Subsequent executions of a failed mapper will rerun completed tasks. This can 
> result in duplicate data or errors due to collisions when publishing.
> The state of completed mappers should be recorded and completed mappers 
> should be skipped on subsequent attempts.


