This is an automated email from the ASF dual-hosted git repository.

suvasude pushed a commit to branch 0.15.0 in repository https://gitbox.apache.org/repos/asf/incubator-gobblin.git
commit 195dac1de06201a10f442ca37dc296ce98288c07
Author: suvasude <[email protected]>
AuthorDate: Mon Aug 3 13:34:20 2020 -0700

    Updated CHANGELOG.md with the change log for 0.15 release.
---
 CHANGELOG.md | 539 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 539 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f141955..c1eaed7 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,542 @@
+GOBBLIN 0.15.0
+--------------
+
+### Created Date: 24/07/2020
+
+## HIGHLIGHTS
+* Auto-scaling of Gobblin on Yarn.
+* New MySQL based DAG state store.
+* New FileSystem based Spec Producer and Job Status Retriever for GaaS.
+* Search GaaS flow configs using flow properties
+* Flow level SLA and flow cancel features in GaaS
+* Flow catalog that updates dynamically as FileSystem is modified.
+* A FileSystem based Job Configuration Manager.
+* Version strategy support and enhancements for distcp of Hive and Config based datasets.
+* Zuora Connector for Gobblin.
+
+
+## NEW FEATURES
+* [GaaS] [GOBBLIN-1196] search flow configs using flow properties and/or other parameters
+* [GaaS] [GOBBLIN-1115] Add flow level data movement authorization in gaas
+* [GaaS] [GOBBLIN-1073] Add proxy user and requester quota to GaaS
+* [GaaS] [GOBBLIN-847] Flow level sla
+* [GaaS] [GOBBLIN-808] implement azkaban flow cancel when dag manager is enabled
+* [GaaS] [GOBBLIN-790] DagStateStore MySQL
+* [GaaS] [GOBBLIN-781] Skeleton for GaaS DR mode clean transition
+* [GaaS] [GOBBLIN-775] Add job level retries for gobblin service
+* [GaaS] [GOBBLIN-768] Add MySQL implementation of SpecStore
+* [GaaS] [GOBBLIN-756] Add flow catalog that updates when filesystem is modified
+* [GaaS] [GOBBLIN-725] Add a mysql based job status retriever
+* [GaaS] [GOBBLIN-723] Add support to the LogCopier for copying from multiple source paths
+* [GaaS] [GOBBLIN-708] Create SqlDatasetDescriptor for JDBC-sourced datasets
+* [GaaS] [GOBBLIN-673] Implement a FS based JobStatusRetriever for GaaS Flows.
+* [Cluster] [GOBBLIN-1162] Provide an option to allow slow containers to commit suicide
+* [Cluster] [GOBBLIN-1031] Gobblin-on-Yarn locally running Azkaban job skeleton
+* [Cluster] [GOBBLIN-789] Implement a FileSystem based SpecProducer.
+* [Cluster] [GOBBLIN-762] Add automatic scaling for Gobblin on YARN
+* [Cluster] [GOBBLIN-742] Implement a FileSystem based JobConfigurationManager
+* [Cluster] [GOBBLIN-737] Add support for Helix quota-based task scheduling
+* [Cluster] [GOBBLIN-649] Add task driver cluster
+* [Salesforce] [GOBBLIN-865] Add feature that enables PK-chunking in partition
+* [Compaction] [GOBBLIN-699] Orc compaction impl.
+* [Core] [GOBBLIN-677] Allow early termination of Gobblin jobs based on a predicate on the job progress
+* [Distcp] [GOBBLIN-772] [GOBBLIN-724] Implement Schema Comparison Strategy during Distcp
+* [Distcp] [GOBBLIN-729] Add version strategy support for HiveCopyDataset
+* [Distcp] [GOBBLIN-712] Add version strategy pickup for ConfigBasedDataset distcp workflow
+* [Source] [GOBBLIN-716] Add lineage in FileBasedSource
+* [Source] [GOBBLIN-628] Zuora Connector
+* [Hive Registration] [GOBBLIN-693] Add ORC hive serde manager
+
+## IMPROVEMENTS
+* [Cluster] [GOBBLIN-1209] Provide an option to configure the java tmp dir to the Yarn cache location
+* [Cluster] [GOBBLIN-1199] convert seconds to ms because helix api takes time in ms
+* [Cluster] [GOBBLIN-1192] Commit suicide if Helix Task creation failed after retry
+* [Cluster] [GOBBLIN-1191] Reuse Helix instance names when containers are released by Gobblin Application Master
+* [Cluster] [GOBBLIN-1184] publish gobblin-cluster-test to artifactory
+* [Cluster] [GOBBLIN-1183] Enable additional yarn class path set for app master
+* [Cluster] [GOBBLIN-1178] Propagation of exception from task creation to helix and request for new container
+* [Cluster] [GOBBLIN-1177] Provide a config for overprovisioning Gobblin Yarn containers by a configurable amount
+* [Cluster] [GOBBLIN-1175] Provide an option to allow GobblinYarnAppLauncher to detach from Yarn application
+* [Cluster] [GOBBLIN-1165] Add config to enable user to set additional yarn classpaths
+* [Cluster] [GOBBLIN-1152] enable helix instance only if it is a participant
+* [Cluster] [GOBBLIN-1141] add support for common job properties in helix job scheduler
+* [Cluster] [GOBBLIN-1136] Make LogCopier be able to refresh FileSystem for long running job use cases
+* [Cluster] [GOBBLIN-1122] Bump up helix-lib version
+* [Cluster] [GOBBLIN-1120] Reinitialize HelixManager when Helix participant check throws an exception
+* [Cluster] [GOBBLIN-1107] Lazily initialize Helix TaskStateModelFactory in GobblinTaskRunner
+* [Cluster] [GOBBLIN-1099] Handle orphaned Yarn containers in Gobblin-on-Yarn clusters
+* [Cluster] [GOBBLIN-1078][RETRY TASK INITIALIZATION] Adding condition to ensure cancellation happened after run
+* [Cluster] [GOBBLIN-1076] Make Gobblin cluster working directories configurable
+* [Cluster] [GOBBLIN-1072] Being more conservative on releasing containers
+* [Cluster] [GOBBLIN-1071] Retry task initialization
+* [Cluster] [GOBBLIN-1052] Create a spec consumer path if it does not exist in FS SpecConsumer
+* [Cluster] [GOBBLIN-1048] Provide an option to pass and set System properties via Gobblin Cluster application config
+* [Cluster] [GOBBLIN-1047] Add Helix and Yarn container metadata to all task events emitted by Gobblin Helix tasks
+* [Cluster] [GOBBLIN-1044] Enrich fork-failure information when task failed
+* [Cluster] [GOBBLIN-1043] Implement a Helix assigned participant check as a CommitStep
+* [Cluster] [GOBBLIN-1036] Add hadoop override configurations when instantiating FileSystem object in GobblinTaskRunner and GobblinClusterManager
+* [Cluster] [GOBBLIN-1032] Provide Helix instance tags config to GobblinYarnTaskRunner
+* [Cluster] [GOBBLIN-1030] Refactor AbstractYarnSecurityManager to expose method for sending token file update message
+* [Cluster] [GOBBLIN-1029] Maintain the last GC stats to accurately report the difference in each interval
+* [Cluster] [GOBBLIN-1018] Report GC counts and durations from Gobblin containers metrics service
+* [Cluster] [GOBBLIN-1016] Allow Gobblin Application Master to join Helix cluster in PARTICIPANT mode when Helix cluster is managed
+* [Cluster] [GOBBLIN-996] Add support for managed Helix clusters for Gobblin-on-Yarn applications
+* [Cluster] [GOBBLIN-973] Increase timeout for copying Gobblin workunits to the workunit state store in GobblinHelixJobLauncher
+* [Cluster] [GOBBLIN-967] Change token refresh method in YarnContainerSecurityManager
+* [Cluster] [GOBBLIN-916] Make ContainerLaunchContext instantiation in YarnService more efficient
+* [Cluster] [GOBBLIN-913] Add MySQL and configurations to cluster
+* [Cluster] [GOBBLIN-904] Provide an option to reuse an existing Helix cluster on Gobblin-Yarn application launch
+* [Cluster] [GOBBLIN-902] Make gobblin yarn app launcher class configurable
+* [Cluster] [GOBBLIN-875] Emit container health metrics when running in cluster mode
+* [Cluster] [GOBBLIN-846] Enhance LogCopier service to handle continuous YARN log aggregation
+* [Cluster] [GOBBLIN-836] Expose container logs location via system property to be used in log4j configuration for Gobblin-on-Yarn applications
+* [Cluster] [GOBBLIN-817] Implement a workaround for Helix Workflow being stuck in STOPPING state.
+* [Cluster] [GOBBLIN-834] Provide config for setting ACLs to control visibility of Gobblin-on-Yarn application logs
+* [Cluster] [GOBBLIN-816] Implement a workaround to abort Helix TaskDriver#getWorkflows() after a timeout
+* [Cluster] [GOBBLIN-798] Clean up workflows from Helix when the Gobblin application master starts
+* [Cluster] [GOBBLIN-795] Make JobCatalog optional for FsJobConfigurationManager
+* [Cluster] [GOBBLIN-780] Handle scenarios that cause the YarnAutoScalingManager to be stuck
+* [Cluster] [GOBBLIN-777] Remove container request after container allocation
+* [Cluster] [GOBBLIN-776] Add a utility method to return Helix WorkflowId given a Gobblin job name.
+* [Cluster] [GOBBLIN-770] Add JVM configuration to avoid exhausting YARN container memory
+* [Cluster] [GOBBLIN-762] Add automatic scaling for Gobblin on YARN
+* [Cluster] [GOBBLIN-744] Support cancellation of a Helix workflow via a DELETE Spec.
+* [Cluster] [GOBBLIN-742] Implement a FileSystem based JobConfigurationManager.
+* [Cluster] [GOBBLIN-743] Initialize Gobblin application master services with dynamic config
+* [Cluster] [GOBBLIN-739] Add a way to propagate the Azkaban job config to Gobblin on YARN
+* [Cluster] [GOBBLIN-737] Add support for Helix quota-based task scheduling
+* [Cluster] [GOBBLIN-732] Pass UGI credentials to the app master and load dynamic config in workers
+* [Cluster] [GOBBLIN-720] Always delete state store
+* [Cluster] [GOBBLIN-723] Add support to the LogCopier for copying from multiple source paths
+* [Cluster] [GOBBLIN-703] Allow planning job to be running in non-blocking mode
+* [Cluster] [GOBBLIN-679] Refactor GobblinHelixTask metrics
+* [Cluster] [GOBBLIN-655] Allow helix job to have a job type.
+* [Cluster] [GOBBLIN-652] Add helix metrics
+* [Cluster] [GOBBLIN-649] Add task driver cluster
+* [Cluster] [GOBBLIN-647] Move early stop logic to task driver instance.
+* [Standalone] [GOBBLIN-903] Initialize docker file for gobblin-standalone and fix docker compose
+* [Standalone] [GOBBLIN-707] rewrite gobblin script to combine all modes and command
+* [Standalone] [GOBBLIN-883] Add docker files and compose
+* [GaaS] [GOBBLIN-1198] status cleaner
+* [GaaS] [GOBBLIN-1168] add metrics in all SpecStore implementations
+* [GaaS] [GOBBLIN-1154] Improve gaas error messages
+* [GaaS] [GOBBLIN-1150] spec catalog table schema change
+* [GaaS] [GOBBLIN-1149] Abstract out method for constructing descriptor from config
+* [GaaS] [GOBBLIN-1144] remove specs from gobblin service job scheduler
+* [GaaS] [GOBBLIN-1137] Add API for getting list of proxy users from an azkaban project
+* [GaaS] [GOBBLIN-1135] added back flow remove feature for spec executors when dag manager is not enabled code style changes
+* [GaaS] [GOBBLIN-1132] move the logic of requester list verification to RequesterService implementation
+* [GaaS] [GOBBLIN-1130] Add API for adding proxy user to azkaban project
+* [GaaS] [GOBBLIN-1125] Add metrics to measure job status state store performance in Gobblin Service
+* [GaaS] [GOBBLIN-1123] Report orchestration delay for Gobblin Service flows
+* [GaaS] [GOBBLIN-1105] some refactoring and make MysqlJobStatusStateStore implement DatasetStateStore
+* [GaaS] [GOBBLIN-1090] send compiled_skip metrics
+* [GaaS] [GOBBLIN-1086] Add job orchestrated time, use job start/prepare time to set job start time in GaaS jobs
+* [GaaS] [GOBBLIN-1084] Refresh flowgraph when templates are modified
+* [GaaS] [GOBBLIN-1082] compile a flow before storing it in spec catalog
+* [GaaS] [GOBBLIN-1075] Add option to return latest failed flows
+* [GaaS] [GOBBLIN-1074] Sort job status array when returning flow status
+* [GaaS] [GOBBLIN-1067] Add SFTP DataNode type in Gobblin-as-a-Service (GaaS) FlowGraph
+* [GaaS] [GOBBLIN-1051] Emit Helix Leader Metrics
+* [GaaS] [GOBBLIN-1050] Verify requester when updating/deleting FlowConfig
+* [GaaS] [GOBBLIN-1038] Set default dataset descriptor configs based on the DataNode
+* [GaaS] [GOBBLIN-1035][Gobblin-1035] make hive dataset descriptor accept regexed db and tables
+* [GaaS] [GOBBLIN-1027] add metrics for users running gaas jobs
+* [GaaS] [GOBBLIN-1017] Deprecate FlowStatus in favor of FlowExecution and add endpoint to kill flows
+* [GaaS] [GOBBLIN-1003] HiveDataNode node addition to support adl and abfs URI for gobblin-Service
+* [GaaS] [GOBBLIN-988] Implement LocalFSJobStatusRetriever
+* [GaaS] [GOBBLIN-958] make hive flow edge accept multiple tables
+* [GaaS] [GOBBLIN-953] Add scoped config for app launcher created by gobblin service
+* [GaaS] [GOBBLIN-948] add hive data node and descriptor
+* [GaaS] [GOBBLIN-946] Add HttpDatasetDescriptor and HttpDataNode to Gobblin Service
+* [GaaS] [GOBBLIN-932] Create deployment for Azure, clean up existing deployments
+* [GaaS] [GOBBLIN-925] Create option to log outputs to console, fix docker-compose
+* [GaaS] [GOBBLIN-917] kill orphan gaas jobs
+* [GaaS] [GOBBLIN-914] accept more tracking events in gaas
+* [GaaS] [GOBBLIN-906] Initializes kubernetes cluster for GaaS and Gobblin Standalone
+* [GaaS] [GOBBLIN-897] adds local FS spec executor to write jobs to a local dir
+* [GaaS] [GOBBLIN-894] Add option to combine datasets into a single flow
+* [GaaS] [GOBBLIN-882] Modify application config so that GaaS runs
+* [GaaS] [GOBBLIN-881] Add job tag field that can be used to filter job statuses
+* [GaaS] [GOBBLIN-870] Adding abfs scheme
+* [GaaS] [GOBBLIN-860] Process flow-level events for setting/retrieving flow status
+* [GaaS] [GOBBLIN-856] make job status monitor a top level service
+* [GaaS] [GOBBLIN-855] persist dag after addspec
+* [GaaS] [GOBBLIN-853] Support multiple paths specified in flow config
+* [GaaS] [GOBBLIN-837] refactor FlowConfigV2Client and FlowStatusClient to allow child classes modify respective RequestBuilders
+* [GaaS] [GOBBLIN-828] Make dynamic config override job config
+* [GaaS] [GOBBLIN-810] Include flow edge ID in job name
+* [GaaS] [GOBBLIN-796] Add support for partial updates for flowConfig
+* [GaaS] [GOBBLIN-793] Separate SpecSerDe from SpecCatalogs and add GsonSpecSerDe
+* [GaaS] [GOBBLIN-792] submit a GobblinTrackingEvent when jobs are compiled but not yet orchestrated
+* [GaaS] [GOBBLIN-782] Add dynamic config to JobSpec
+* [GaaS] [GOBBLIN-781] Skeleton for GaaS DR mode clean transition
+* [GaaS] [GOBBLIN-790] DagStateStore MySQL
+* [GaaS] [GOBBLIN-786] Separate SerDe library in DagStateStore out for GaaS-wide sharing
+* [GaaS] [GOBBLIN-779] make job status retriever configurable
+* [GaaS] [GOBBLIN-775] Add job level retries for gobblin service
+* [GaaS] [GOBBLIN-773] handle job cancellation case in status monitor
+* [GaaS] [GOBBLIN-771] add a few metrics for gobblin service
+* [GaaS] [GOBBLIN-768] Add MySQL implementation of SpecStore
+* [GaaS] [GOBBLIN-765] Remove a duplicate leading period character from the config key for SqlDataNode
+* [GaaS] [GOBBLIN-756] Add flow catalog that updates when filesystem is modified
+* [GaaS] [GOBBLIN-748] Craftsmanship code cleaning in Gobblin Service Code
+* [GaaS] [GOBBLIN-746] Async loading FlowSpec
+* [GaaS] [GOBBLIN-730] added job start and end time in flow status retriever
+* [GaaS] [GOBBLIN-731] Make deserialization of FlowSpec more robust
+* [GaaS] [GOBBLIN-725] add a mysql based job status retriever
+* [GaaS] [GOBBLIN-722] add option to unschedule a flow set schedule even if the job is already scheduled
+* [GaaS] [GOBBLIN-720] Always delete state store
+* [GaaS] [GOBBLIN-713] Lazy load job specification from job catalog to avoid OOM issue.
+* [GaaS] [GOBBLIN-709] Provide an option to disallow concurrent flow executions in Gobblin-as-a-Service
+* [GaaS] [GOBBLIN-708] Create SqlDatasetDescriptor for JDBC-sourced datasets.
+* [GaaS] [GOBBLIN-698] Enhance logging to print job and flow details when a job is orchestrated by GaaS
+* [GaaS] [GOBBLIN-696] Provide an "explain" option to return a compiled flow when a flow config is added.
+* [GaaS] [GOBBLIN-692] Add support to query last K flow executions in Gobblin-as-a-Service (GaaS)
+* [GaaS] [GOBBLIN-687] Pass TopologySpec map to DagManager to allow reuse of SpecExecutors during DAG deserialization
+* [GaaS] [GOBBLIN-683] Add azkaban client retry logic.
+* [GaaS] [GOBBLIN-681] increase max size of job name
+* [GaaS] [GOBBLIN-688] Make FsJobStatusRetriever config more scoped.
+* [GaaS] [GOBBLIN-678] Make flow.executionId available in the GaaS Flow config for use in job templates.
+* [GaaS] [GOBBLIN-675] Enhance FSDatasetDescriptor definition to include partition config, encryption level and compaction config.
+* [GaaS] [GOBBLIN-673] Implement a FS based JobStatusRetriever for GaaS Flows.
+* [GaaS] [GOBBLIN-667] Pass encrypt.key.loc configuration to GitFlowGraphMonitor.
+* [GaaS] [GOBBLIN-664] Refactor Azkaban Client for session refresh.
+* [GaaS] [GOBBLIN-662] Enhance SSH-based access to Git to enable/disable host key checking.
+* [GaaS] [GOBBLIN-658] Submit a JobFailed event when an exception is encountered during Job orchestration in Gobblin service.
+* [GaaS] [GOBBLIN-653] Create JobSucceededTimer tracking event to accurately track successful Gobblin jobs.
+* [GaaS] [GOBBLIN-646] Refactor MultiHopFlowCompiler to use SpecExecutor configs from TopologySpecMap.
+* [GaaS] [GOBBLIN-644] Add metrics reporting config dynamically to compiled flows in MultiHopFlowCompiler.
+* [GaaS] [GOBBLIN-639] Change method to static for RequesterService serde
+* [GaaS] [GOBBLIN-638] Submit more timing events from GaaS to accurately track flow/job status.
+* [GaaS] [GOBBLIN-636] Use FS scheme and relative URIs for specifying job template locations in GaaS.
+* [Compaction] [GOBBLIN-1214] Move the fallback of ineligible shuffleKey to driver
+* [Compaction] [GOBBLIN-1201] Add dataset.urn in GTE for MRCompactionTask
+* [Compaction] [GOBBLIN-1190] Fallback to full schema if configured shuffle schema is not available
+* [Compaction] [GOBBLIN-1126] Make ORC compaction shuffle key configurable
+* [Compaction] [GOBBLIN-1133] Add CompactionSuiteBaseWithConfigurableCompleteAction to make complete action configurable
+* [Compaction] [GOBBLIN-1045] Emit more events in compaction job
+* [Compaction] [GOBBLIN-1012] Implement CompactionWithWatermarkSuite
+* [Compaction] [GOBBLIN-763] Support fields removal for compaction dedup key schema
+* [Compaction] [GOBBLIN-848] Make initialization of CompactionSource extensible with certain protection
+* [Compaction] [GOBBLIN-699] Orc compaction impl.
+* [Compaction] [GOBBLIN-691] Make format-specific component pluggable in compaction
+* [Compaction] [GOBBLIN-1011] adjust compaction flow to work with virtual partition
+* [Compaction] [GOBBLIN-1158] Use input dir to document old files instead of file paths to reduce memory cost in Compaction configurator
+* [Compaction] [GOBBLIN-1117] Enable record count verification for ORC format
+* [Compaction] [GOBBLIN-884] Support ORC schema evolution across mappers in MR mode
+* [Hive Registration] [GOBBLIN-1206] Only populate path to dest-table if src-table has it as storageParam
+* [Hive Registration] [GOBBLIN-1145] add path in serde props
+* [Hive Registration] [GOBBLIN-1006] Enable configurable case-preserving and schema source-of-truth in table level properties
+* [Hive Registration] [GOBBLIN-993] Support job level hive configuration override
+* [Hive Registration] [GOBBLIN-986] persist the existing property of iceberg
+* [Hive Registration] [GOBBLIN-954] Added support to swap different HiveRegistrationPublishers
+* [Hive Registration] [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
+* [Hive Registration] [GOBBLIN-912] Enable TTL caching on Hive Metastore client connection
+* [Hive Registration] [GOBBLIN-877] Add column metadata for partition for inline hive registration
+* [Hive Registration] [GOBBLIN-861] Skip getPartition() call to Hive Metastore when a partition already exists
+* [Hive Registration] [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
+* [Hive Registration] [GOBBLIN-852] Reorganize the code for hive registration to isolate function
+* [Hive Registration] [GOBBLIN-753] Refactor HiveRegistrationPolicyBase to surface configStore object
+* [Hive Registration] [GOBBLIN-705] create method for merging tblProps from existing hive meta table
+* [Hive Registration] [GOBBLIN-704] Add serde attributes for orc
+* [Hive Registration] [GOBBLIN-693] Add ORC hive serde manager
+* [Hive Registration] [GOBBLIN-1148] improve hive test coverage
+* [Hive Registration] [GOBBLIN-921] Make pull/push mode when registering partition to be configurable
+* [Hive Registration] [GOBBLIN-893] Make format-check in ORC-registration optional and by-default disabled
+* [Distcp] [GOBBLIN-1216] Embedded Hive Distcp
+* [Distcp] [GOBBLIN-1203] Adding configurations for staging directory in Embedded Distcp template
+* [Distcp] [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables
+* [Distcp] [GOBBLIN-1057] Optimize unnecessary RPCs in distcp-ng
+* [Distcp] [GOBBLIN-1001] Implement TimePartitionGlobFinder
+* [Distcp] [GOBBLIN-962] Refactor RecursiveCopyableDataset.
+* [Distcp] [GOBBLIN-961] Bypass locked directories when calculating src watermark
+* [Distcp] [GOBBLIN-910] Added a unix timestamp recursive copyable dataset finder
+* [Distcp] [GOBBLIN-899] Add config in replication config to determine whether schema check is enabled for the dataset
+* [Distcp] [GOBBLIN-888] Make yyyy-MM-dd-HH-mm recognizable in TimeAwareRecursiveCopyableDataset
+* [Distcp] [GOBBLIN-784] Allow setting replication factor in distcp
+* [Distcp] [GOBBLIN-772] Implement Schema Comparison Strategy during Distcp
+* [Distcp] [GOBBLIN-751] Make enforced file size matching to be configurable
+* [Distcp] [GOBBLIN-729] Add version strategy support for HiveCopyDataset
+* [Distcp] [GOBBLIN-726] Enable schema check
+* [Distcp] [GOBBLIN-712] Add version strategy pickup for ConfigBasedDataset distcp workflow
+* [Distcp] [GOBBLIN-697] Implementation of data file versioning and preservation in distcp.
+* [Distcp] [GOBBLIN-598] Add documentation on split enabled distcp (config glossary & gobblin distcp page)
+* [Kafka] [GOBBLIN-1143] Add a generic wrapper producer client to communicate with Kafka
+* [Kafka] [GOBBLIN-1112] Implement a new HttpMethodRetryHandler that allows retrying a HTTP method on transient network errors
+* [Kafka] [GOBBLIN-1064] Make KafkaAvroSchemaRegistry extendable
+* [Kafka] [GOBBLIN-1040] HighLevelConsumer re-design
+* [Kafka] [GOBBLIN-970] Pass metric context from the KafkaSource to the KafkaWorkUnitPacker for emission of metrics from the packer
+* [Kafka] [GOBBLIN-886] add callback to kafka apis
+* [Kafka] [GOBBLIN-857] Extending getTopicsFromConfigStore to accept topicName directly
+* [Kafka] [GOBBLIN-684] Ensure buffered messages are flushed before close() in KafkaProducerPusher
+* [Kafka] [GOBBLIN-651] Ensure ordered delivery of Kafka events from KeyValueProducerPusher for kafka-08.
+* [Kafka] [GOBBLIN-650] Ensure ordered delivery of Kafka events from KeyValueProducerPusher.
+* [Kafka] [GOBBLIN-642] Implement KafkaAvroEventKeyValueReporter
+* [Kafka] [GOBBLIN-640] Add a Kafka producer pusher that supports keyed messages
+* [Avro-to-ORC] [GOBBLIN-1046] Make /final subdir configurable in ORC-conversion output
+* [Avro-to-ORC] [GOBBLIN-1024] Supporting Avro logical type recognition in Avro-to-ORC transformation
+* [Avro-to-ORC] [GOBBLIN-999] Separate Hive-Avro type related constants out of Avro2ORC specific module to make it re-usable
+* [Avro-to-ORC] [GOBBLIN-975] Add flag to enable/disable avro type check in AvroToOrc
+* [Avro-to-ORC] [GOBBLIN-755] add delimiter to hive queries
+* [Salesforce] [GOBBLIN-1202] Add retry for REST API call
+* [Salesforce] [GOBBLIN-1186] explicitly set source.querybased.salesforce.is.soft.deletes.pull.disabled for simple mode
+* [Salesforce] [GOBBLIN-1179] Add typed config in salesforce
+* [Salesforce] [GOBBLIN-1101] Enhance bulk api retry for ExceedQuota
+* [Salesforce] [GOBBLIN-1025] Add retry for PK-Chunking iterator
+* [Salesforce] [GOBBLIN-995] Add function to instantiate the BulkConnection in SFDC connector
+* [Salesforce] [GOBBLIN-862] Security token encryption support in SFDC connector
+* [Salesforce] [GOBBLIN-813] Make SFDC connector support encrypted Salesforce client id and client secret
+* [Salesforce] [GOBBLIN-778] [GOBBLIN-551] Moving config creation to a separate method
+* [Global Throttling] [GOBBLIN-764] Allow injection of Rest.li configurations for throttling client and fixed unit test.
+* [Global Throttling] [GOBBLIN-760] Improve retrying behavior of throttling client
+* [Global Throttling] [GOBBLIN-749] Add logging to limiter server.
+* [Global Throttling] [GOBBLIN-724] Throttling server delays responses for throttling causing too many connections
+* [Source] [GOBBLIN-1174] Fail job on FileBasedSource ls invalid source directory
+* [Source] [GOBBLIN-1056] Refactor to allow customizing client pool population in KafkaSource
+* [Source] [GOBBLIN-1054] Refactor HiveSource to make partition filter extensible
+* [Source] [GOBBLIN-879] Refactor bin-packer for better code reuse
+* [Source] [GOBBLIN-874] Make WorkUnitPacker and SizeEstimator pluggable
+* [Source] [GOBBLIN-738] Open a way to customize decoding KafkaConsumerRecord
+* [Source] [GOBBLIN-716] Add lineage in FileBasedSource
+* [Extractor] [GOBBLIN-1207] Clear references to potentially large objects in Fork, FileBasedExtractor, and HiveWritableHdfsDataWriter
+* [Extractor] [GOBBLIN-1100] Set average fetch time in the KafkaExtractor even when metrics are disabled
+* [Extractor] [GOBBLIN-1087] Track and report histogram of observed lag from Gobblin Kafka pipeline
+* [Extractor] [GOBBLIN-1079] set extract.is.full property
+* [Extractor] [GOBBLIN-1058] Refactor method emitting GTE for ease of adding new tags
+* [Extractor] [GOBBLIN-1000] Add min and max LogAppendTime to tracking events emitted from Gobblin Kafka Extractor
+* [Extractor] [GOBBLIN-989] Track and report record level SLA in Gobblin Kafka Extractor tracking event
+* [Extractor] [GOBBLIN-955] Expose a method to get average record size in KafkaExtractorStatsTracker
+* [Extractor] [GOBBLIN-945] Refactor Kafka extractor statistics tracking to allow code reuse across both batch and streaming execution modes
+* [Extractor] [GOBBLIN-915] Allow users to customize the Extract timezone.
+* [Extractor] [GOBBLIN-890] Making ExtractID timeZone Configurable
+* [Extractor] [GOBBLIN-887] Generalize UniversalKafkaSource to accept Extractors not extending KafkaExtractor
+* [Extractor] [GOBBLIN-876] Expose metrics() API in GobblinKafkaConsumerClient to allow consume metrics to be reported
+* [Extractor] [GOBBLIN-873] Add offset look-back option in Kafka consumer
+* [Extractor] [GOBBLIN-778][GOBBLIN-551] Moving config creation to a separate method
+* [Extractor] [GOBBLIN-738] Open a way to customize decoding KafkaConsumerRecord
+* [Extractor] [GOBBLIN-717] Filter Out Empty MultiWorkUnits
+* [Extractor] [GOBBLIN-706] Enable dynamic mappers
+* [Extractor] [GOBBLIN-833] Make SFTP connection timeout-table
+* [Converter] [GOBBLIN-1080] Add configuration to add schema creation time in converter
+* [Converter] [GOBBLIN-1081] Adding support of timestamp data type for CsvToJsonConverter
+* [Converter] [GOBBLIN-1066] field projection with namespace
+* [Converter] [GOBBLIN-983] use java string library for string format
+* [Converter] [GOBBLIN-957] Add recursion eliminating code, converter for Avro
+* [Converter] [GOBBLIN-933] add support for array of unions in json schema_new
+* [Converter] [GOBBLIN-896] Clone schema and field props in AvroSchemaFieldRemover
+* [Converter] [GOBBLIN-757] Adding utility functions to support decoration of Avro Generic Records
+* [Converter] [GOBBLIN-755] Add delimiter to hive queries
+* [Converter] [GOBBLIN-733] Instrument Avro Converters to allow converter metrics emission in both batch and streaming modes.
+* [Converter] [GOBBLIN-686] Enhance schema comparison
+* [Converter] [GOBBLIN-676] Add record metadata support to the RecordEnvelope
+* [Quality Checker] [GOBBLIN-1119] Enable close-on-flush for quality-checker's err-file
+* [Quality Checker] [GOBBLIN-1089] Refactor policyChecker for extensibility
+* [Quality Checker] [GOBBLIN-971] Enable speculative execution awareness for RowQualityChecker
+* [Writer] [GOBBLIN-1181] Make parquet-proto compileOnly dependency
+* [Writer] [GOBBLIN-1155] Make socket connect timeout configurable for couchbase writer
+* [Writer] [GOBBLIN-1147] Use one dfsClient in FsDataWriter to rename and exists check to avoid inconsistency
+* [Writer] [GOBBLIN-1146] Allow configuring autocommit in JDBCWriters
+* [Writer] [GOBBLIN-1015] Adding support for direct Avro and Protobuf writes in Parquet format
+* [Writer] [GOBBLIN-1008] Upgrading parquet dependency to org.apache.parquet. Fixing tests
+* [Writer] [GOBBLIN-928] Craftsmanship cleaning and bumping up ORC version
+* [Writer] [GOBBLIN-911] Make profiling of HiveWritableHdfsDataWriter easier by injecting jobConf
+* [Writer] [GOBBLIN-880] Bump CouchbaseWriter Couchbase SDK version + write docs + cert based auth + enable TTL + dnsSrv
+* [Writer] [GOBBLIN-859] let writer pass latest schema to WorkUnitState
+* [Writer] [GOBBLIN-820] Add keyed write capability to Kafka writer
+* [Writer] [GOBBLIN-769] Support string record timestamp in TimeBasedAvroWriterPartitioner
+* [Writer] [GOBBLIN-767] Support different time units in TimeBasedWriterPartitioner
+* [Writer] [GOBBLIN-736] Skip flush and control message handlers on closed writers in the CloseOnFlushWriterWrapper
+* [Writer] [GOBBLIN-727] Skip commit in CloseOnFlushWriterWrapper if a commit has already been invoked on the underlying writer.
+* [Writer] [GOBBLIN-695] Adding utility functions to generate Avro/ORC binary using json
+* [Writer] [GOBBLIN-630] Add a concrete implementation for Postgres writer
+* [Core] [GOBBLIN-1217] start metrics reporting with a few map-reduce properties
+* [Core] [GOBBLIN-1189] Relax the condition for the increasing ingestion latency check
+* [Core] [GOBBLIN-1049] Move workunit commit logic to the end of publish().
+* [Core] [GOBBLIN-774] Send nack when a control message handler fails in Fork
+* [Core] [GOBBLIN-721] Remove additional ack. Simplify watermark manager
+* [Core] [GOBBLIN-706] enable dynamic mappers
+* [Core] [GOBBLIN-677] Allow early termination of Gobblin jobs based on a predicate on the job progress
+* [Core] [GOBBLIN-676] Add record metadata support to the RecordEnvelope
+* [Core] [GOBBLIN-653] Create JobSucceededTimer tracking event to accurately track successful Gobblin jobs.
+* [Runtime] [GOBBLIN-1041] send metrics for workunit creation time
+* [Runtime] [GOBBLIN-992] Make parallelRunner timeout configurable in MRJobLauncher
+* [Runtime] [GOBBLIN-976] Add dynamic config to the state before instantiating metrics reporter in MRJobLauncher
+* [Runtime] [GOBBLIN-964] Add the enum JOB_SUCCEEDED to org.apache.gobblin.metrics.event.EventName
+* [Runtime] [GOBBLIN-938] Make job-template resolution available in all JobLaunchers
+* [Runtime] [GOBBLIN-908] Customized Progress to enable speculative execution
+* [Runtime] [GOBBLIN-766] Emit WorkUnitsCreated Count Event for MR deployed jobs.
+* [Runtime] [GOBBLIN-864] add job error message in job state
+* [Runtime] [GOBBLIN-787] Add an option to include the task start time in the output file name
+* [Runtime] [GOBBLIN-766] Emit Workunits Created event
+* [Runtime] [GOBBLIN-774] Send nack when a control message handler fails in Fork
+* [Runtime] [GOBBLIN-713] Lazy load job specification from job catalog to avoid OOM issue.
+* [Runtime] [GOBBLIN-685] Add dump jstack for EmbeddedGobblin
+* [Gobblin Metrics] [GOBBLIN-1127] Provide an option to make metric reporting instantiation failure fatal
+* [Gobblin Metrics] [GOBBLIN-1116] Avoid registering schema with schema registry during MetricReporting initialization from cluster workers
+* [Gobblin Metrics] [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry
+* [Gobblin Metrics] [GOBBLIN-802] change gauge metrics context to RootMetricsContext
+* [Gobblin Metrics] [GOBBLIN-758] Added new reporters to emit MetricReport and GobblinTrackingEvent without serializing them. Also added random key generator for reporters.
+* [Gobblin Metrics] [GOBBLIN-827] Add more events
+* [Gobblin Metrics] [GOBBLIN-807] TimingEvent is now closeable, extends GobblinEventBuilder
+* [Util] [GOBBLIN-757] Adding utility functions to support decoration of Avro Generic Records
+* [Util] [GOBBLIN-695] Adding utility functions to generate Avro/ORC binary using json
+* [Job Templates] [GOBBLIN-960] Resolving multiple templates in top-level
+* [Job Templates] [GOBBLIN-701] Add secure templates (duplicate of #2571)
+* [Retention] [GOBBLIN-1185] Enable dataset cleaner to emit kafka events
+* [Retention] [GOBBLIN-806] Enable metrics reporter during dataset discovery for retention job
+* [Retention] [GOBBLIN-682] Create a new constructor for DatasetCleanerJob.
+* [MySQL] [GOBBLIN-1108] bump up mysql-connector
+* [Config] [GOBBLIN-1157] get a json representation of object if config type is different
+* [State Store] [GOBBLIN-1151] use gson in place of jackson for serialize/deserialize
+* [Config Store] [GOBBLIN-761] Only instantiate topic-specific configStore object when topic.name is available
+* [Embedded] [GOBBLIN-685] Add dump jstack for EmbeddedGobblin
+* [Apache] [GOBBLIN-1215] adding travis retry
+* [Apache] [GOBBLIN-1159] Added code to publish gobblin artifacts to bintray
+* [Apache] [GOBBLIN-1172] Migrate to Ubuntu 18 with openjdk8 for Travis
+* [Apache] [GOBBLIN-641] Reserve version 0.15.0 for next release
+* [Build] [GOBBLIN-735] Relocate all google classes to cover protobuf and guava dependency in orc-dep jar
+* [Build] [GOBBLIN-1176] create gobblin-all module resolving full dependency tree
+* [Build] [GOBBLIN-829] Executing jacocReport before uploading to codecov
+* [Build] [GOBBLIN-821] Adding Codecov
+* [Documentation] [GOBBLIN-1094] Added documentation of High level consumer
+* [Documentation] [GOBBLIN-1053] Update Readme with new badges and links
+* [Documentation] [GOBBLIN-669] Configuration Properties Glossary section of Docs hard to read
+* [Documentation] [GOBBLIN-598] Add documentation on split enabled distcp (config glossary & gobblin distcp page)
+
+## BUG FIXES
+* [Bug] [GOBBLIN-1220] Log improvement
+* [Bug] [GOBBLIN-1219] do not schedule flow spec from slave instance of GobblinServiceJobScheduler
+* [Bug] [GOBBLIN-1218] deserialize flow config properties using old method for backward compatibility
+* [Bug] [GOBBLIN-1212] Handle non-Primitive type eligibility-check for shuffle key properly
+* [Bug] [GOBBLIN-1210][GOBBLIN-1217] Force AM to read from token file to update token when start up (replace PR)
+* [Bug] [GOBBLIN-1213] Skip job state deserialization during SingleFailInCreationTask instantiation
+* [Bug] [GOBBLIN-1211] Track unused Helix instances in a thread-safe manner
+* [Bug] [GOBBLIN-1208] Fix - restApiRetryLimit cannot be set to 0
+* [Bug] [GOBBLIN-1200] Fix bug when local network throttling distcp jobs
+* [Bug] [GOBBLIN-1197] Attempting resolving race condition among different tests' port allocation
+* [Bug] [GOBBLIN-1193] Ensure that ingestion latency is 0 when no records are consumed by Kafka Extractor
+* [Bug] [GOBBLIN-1188] fix log message for SFDC iterators
+* [Bug] [GOBBLIN-1180] Removed dependencies on gobblin-parquet
+* [Bug] [GOBBLIN-1170] Add missing booleanWritable type
+* [Bug] [GOBBLIN-1173] Prevent propagating exceptions from the finally block in KafkaSource#getWorkUnits
+* [Bug] [GOBBLIN-1160] No spec delete on gobblin service start
+* [Bug] [GOBBLIN-1163] Fix travis formatting error
+* [Bug] [GOBBLIN-1124] Add exception error message.
+* [Bug] [GOBBLIN-1121] Fix Issue that YarnService use the old token to acquire new container
+* [Bug] [GOBBLIN-1118] Bump up ORC version to 1.6.2 to pick up ORC-569
+* [Bug] [GOBBLIN-1113] Carry forward requester list property when updating flowconfig
+* [Bug] [GOBBLIN-1114] OrcValueMapper schema evolution up-conversion recursive
+* [Bug] [GOBBLIN-1111] CsvToJsonConverterV2 should not print out raw data in the log
+* [Bug] [GOBBLIN-1110] fix deadlock in job cancellation replacing deprecated class MessageHandlerFactory with MultiTypeMessageHandlerFactory
+* [Bug] [GOBBLIN-1109] partial rollback of PR#2836
+* [Bug] [GOBBLIN-1106] do not remove requester list
+* [Bug] [GOBBLIN-1102] Add link to GIP
+* [Bug] [GOBBLIN-1100] Change access modifier for generateTagsForPartitions to accommodate with
+* [Bug] [GOBBLIN-1098] Remove commons-lang and slf4j from the orc-dep fat jar
+* [Bug] [GOBBLIN-1097] ResultChainingIterator.add should check if the argument iterator is null
+* [Bug] [GOBBLIN-1096] Work with DST change in compaction watermark
+* [Bug] [GOBBLIN-1092] added some logs, fix checkstyle, removed some redundant code
+* [Bug] [GOBBLIN-1091] Pass Yarn application id as part of AppMaster and YarnTaskRunner's start up command
+* [Bug] [GOBBLIN-1088] Don't lowercase partition pattern config
+* [Bug] [GOBBLIN-1085] fix compaction initialization
+* [Bug] [GOBBLIN-1077] Fix bug in HiveDataset.resolveConfig
+* [Bug] [GOBBLIN-1069] Add NPE check in handleContainerCompletion method
+* [Bug] [GOBBLIN-1065] Fix SSL verification issue for macOS
+* [Bug] [GOBBLIN-1063] add log
+* [Bug] [GOBBLIN-1062] Add log when loading dags from state store
+* [Bug] [GOBBLIN-1060] Fix wrong fileSystem object in YarnApplauncher
+* [Bug] [GOBBLIN-1042] Fix ForkMetric incorrect return type of parent metric object and relevant unit tests
+* [Bug] [GOBBLIN-1037] Disable UnixTimestampRecursiveCopyableDatasetTest
+* [Bug] [GOBBLIN-1034] Ensure underlying writers are expired from the PartitionedDataWriter cache to avoid accumulation of writers for long running Gobblin jobs
+* [Bug] [GOBBLIN-1019] Change jcenter url to https
+* [Bug] [GOBBLIN-1013] Fix prepare_release_config build step on Windows
+* [Bug] [GOBBLIN-1014] Fix error handling in gobblin.sh
+* [Bug] [GOBBLIN-1002] Set state id when deserializing state from Gobblin state store
+* [Bug] [GOBBLIN-998] ExecutionStatus should be reset to PENDING before a job retries
+* [Bug] [GOBBLIN-997] Add serialVersionUID to FlowSpec for backwards compatibility
+* [Bug] [GOBBLIN-994] fix wrong import org.testng.collections.Lists
+* [Bug] [GOBBLIN-990] Don't allow creation of flow config that already exists
+* [Bug] [GOBBLIN-987] Reject unrecognized Enum symbols in JsonRecordAvroSchemaToAvroConverter
+* [Bug] [GOBBLIN-981] Handle backward compatibility issue in HiveSource
+* [Bug] [GOBBLIN-980] Fix AzkabanClient always throwing exception when cancelling flow
+* [Bug] [GOBBLIN-978] Use job start time instead of flow start time to kill jobs stuck in ORCHESTRATED state
+* [Bug] [GOBBLIN-977] Update the misleading comments in BaseAbstractTask
+* [Bug] [GOBBLIN-974] Avoid updating job/flow status if messages arrive out of order
+* [Bug] [GOBBLIN-972] Make DEFAULT_NUM_THREADS in DagManager public
+* [Bug] [GOBBLIN-969] Bump orc version for bug fixes
+* [Bug] [GOBBLIN-966] Check if no partitions have been processed by KafkaExtractor in close() method to avoid ArrayIndexOutOfBoundsException
+* [Bug] [GOBBLIN-963] Remove duplicated copies of TaskContext/TaskState when constructing TaskIFaceWrapper
+* [Bug] [GOBBLIN-956] Continue loading dags until queue is drained
+* [Bug] [GOBBLIN-950] Avoid persisting dag right after loading it on startup
+* [Bug] [GOBBLIN-940] Add synchronization on workunit persistency before Helix job launching
+* [Bug] [GOBBLIN-937] fix help text and align it with variable names
+* [Bug] [GOBBLIN-924] Get rid of orc.schema.literal in ORC-ingestion and registration
+* [Bug] [GOBBLIN-923] Fix Array and Map JsonElement converters to handle nullable elements
+* [Bug] [GOBBLIN-922] fix start sla time unit conversion issue
+* [Bug] [GOBBLIN-919] Using apache commons Pair API
+* [Bug] [GOBBLIN-909] Return error message for unresolved substitutions in explain query
+* [Bug] [GOBBLIN-905] Fixes issue where newly added jobs would crash in gobblin standalone's job conf folder
+* [Bug] [GOBBLIN-895] Fixes Gobblin Standalone configs and scripts so that the user guide is accurate
+* [Bug] [GOBBLIN-892] reverting back bad changes done in PR 2720
+* [Bug] [GOBBLIN-891] Fixing Couchbase writer docs
+* [Bug] [GOBBLIN-887] Fix the FileContext wrong fsUri issue.
+* [Bug] [GOBBLIN-885] Fix orc-Compaction bug in non-dedup mode and add unit-test
+* [Bug] [GOBBLIN-872] Only use one CouchbaseEnvironment instance per JVM
+* [Bug] [GOBBLIN-868] Check flow status instead of job status to determine if flow is running
+* [Bug] [GOBBLIN-863] Handle race condition issue for hive registration
+* [Bug] [GOBBLIN-841] make some fields public
+* [Bug] [GOBBLIN-840] avoid creating a flow execution id by both master and slave gaas
+* [Bug] [GOBBLIN-839] Catch all exceptions when getting flow template at runtime
+* [Bug] [GOBBLIN-838] Fix Ivy-based ConfigStoreUtils and add Unit Test
+* [Bug] [GOBBLIN-825] Initialize message schema at object construction rather than creating a new instance for every message
+* [Bug] [GOBBLIN-809] Fix RateBasedLimitter factory to get double instead of long rate for initialization
+* [Bug] [GOBBLIN-805] Fix dag being cleaned twice
+* [Bug] [GOBBLIN-804] Fix config member variable not being set
+* [Bug] [GOBBLIN-799] Fix bug in AvroSchemaCheckDefaultStrategy
+* [Bug] [GOBBLIN-798] Clean up workflows from Helix when the Gobblin application dies
+* [Bug] [GOBBLIN-794] fix JsonIntermedidateToAvroConverter for nested array/object use cases
+* [Bug] [GOBBLIN-791] Fix hanging stream on error in asynchronous execution model
+* [Bug] [GOBBLIN-785] remove wrapper isPartition function, use table.isPartitioned instead
+* [Bug] [GOBBLIN-783] Fix the double referencing issue for job type config.
+* [Bug] [GOBBLIN-780] Handle scenarios that cause the YarnAutoScalingManager to be stuck
+* [Bug] [GOBBLIN-777] Remove container request after container allocation
+* [Bug] [GOBBLIN-765] Remove a duplicate leading period character from the config key for SqlDataNode
+* [Bug] [GOBBLIN-761] Only instantiate topic-specific configStore object when topic.name is available
+* [Bug] [GOBBLIN-754] Clean old version of multi-hop compiler
+* [Bug] [GOBBLIN-752] Fix a bug in QPS throttling policy where it was incorrectly indicating permits were impossible to satisfy.
+* [Bug] [GOBBLIN-747] Check schema
+* [Bug] [GOBBLIN-740] Remove setting retentionPolicy on every Point write
+* [Bug] [GOBBLIN-736] Skip flush and control message handlers on closed writers in the CloseOnFlushWriterWrapper
+* [Bug] [GOBBLIN-734] Fix speculative safety checking in HiveWritable writer
+* [Bug] [GOBBLIN-731] Make deserialization of FlowSpec more robust
+* [Bug] [GOBBLIN-727] Skip commit in CloseOnFlushWriterWrapper if a commit has already been invoked on the underlying writer.
+* [Bug] [GOBBLIN-726] enable schema check for ticket ETL-8753
+* [Bug] [GOBBLIN-721] Gobblin streaming recipe is broken
+* [Bug] [GOBBLIN-719] fix invalid git links for classes in docs
+* [Bug] [GOBBLIN-717] Filter Out Empty MultiWorkUnits
+* [Bug] [GOBBLIN-702] Compaction fix for reuse of OrcStruct
+* [Bug] [GOBBLIN-690] Fix the planning job relaunch name match.
+* [Bug] [GOBBLIN-689] catch unchecked exceptions in KafkaSource
+* [Bug] [GOBBLIN-684] Ensure buffered messages are flushed before close() in KafkaProducerPusher
+* [Bug] [GOBBLIN-680] Enhance error handling on task creation
+* [Bug] [GOBBLIN-674] Skip initialization of GitMonitoringService when gobblin template dirs is empty.
+* [Bug] [GOBBLIN-671] Close the underlying writer when a HiveWritableHdfsDataWriter is closed
+* [Bug] [GOBBLIN-670] Ensure MultiHopFlowCompiler is initialized when job template catalog location is not provided.
+* [Bug] [GOBBLIN-667] Pass encrypt.key.loc configuration to GitFlowGraphMonitor.
+* [Bug] [GOBBLIN-666] Data too long for column 'property_key'
+* [Bug] [GOBBLIN-665] Throw an exception if job orchestration fails on a SpecExecutor.
+* [Bug] [GOBBLIN-663] Fix Typesafe config resolution failure for non-HOCON strings
+* [Bug] [GOBBLIN-661] Prevent jobs resubmission after manager failure
+* [Bug] [GOBBLIN-660] Fix OracleExtractor datatype mapping
+* [Bug] [GOBBLIN-659] Ensure MultiHopFlowCompiler is properly initialized before attempting flow orchestration.
+* [Bug] [GOBBLIN-654] Fix the argument order of JobStatusRetriever APIs to reflect actual usage.
+* [Bug] [GOBBLIN-645] Fix some typos as reading thru code
+* [Bug] [GOBBLIN-643] Fix NPE when closing KafkaExtractor
+* [Bug] [GOBBLIN-593] fix NPE in task cancel
+* [Bug] [GOBBLIN-571] Fix parquet schema for complex types
+
+
 GOBBLIN 0.14.0
 -------------
