[
https://issues.apache.org/jira/browse/GOBBLIN-2087?focusedWorklogId=923263&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-923263
]
ASF GitHub Bot logged work on GOBBLIN-2087:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 13/Jun/24 09:04
Start Date: 13/Jun/24 09:04
Worklog Time Spent: 10m
Work Description: pawanbtej opened a new pull request, #3972:
URL: https://github.com/apache/gobblin/pull/3972
This PR removes the requirement that the database name be extracted from the
dataset URN with a regex pattern by adding support for an optional database
name in the `DatasetHiveSchemaContainsNonOptionalUnion` class.
Issue with the current approach:
- The database name had to be extracted from the dataset URN using a regex
pattern, which limited flexibility and could lead to errors if the URN format
changed.
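To illustrate the fragility described above, here is a minimal sketch of regex-based extraction. It is not the Gobblin implementation: the class name, the pattern, and the sample URNs are all hypothetical and chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: not the actual Gobblin class.
final class UrnDbExtractionSketch {
  // Hypothetical pattern: group 1 is assumed to be the database, group 2 the table.
  static final Pattern DB_TABLE = Pattern.compile("/data/([^/]+)/([^/]+)");

  // Returns the database name, or throws if the URN does not fit the pattern.
  static String extractDb(String urn) {
    Matcher m = DB_TABLE.matcher(urn);
    if (!m.matches()) {
      throw new IllegalStateException("URN does not match pattern: " + urn);
    }
    return m.group(1);
  }

  public static void main(String[] args) {
    // A URN in the expected layout yields the database name.
    System.out.println(extractDb("/data/tracking/PageViewEvent")); // prints "tracking"

    // A URN in a different layout no longer matches, so extraction fails.
    try {
      extractDb("/warehouse/v2/tracking/PageViewEvent");
    } catch (IllegalStateException e) {
      System.out.println("Extraction failed: " + e.getMessage());
    }
  }
}
```

Any change to the URN layout requires a matching change to the pattern, which is the coupling this PR aims to loosen.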
Dear Gobblin maintainers,
Please accept this PR. I understand that it will not be reviewed until I
have checked off all the steps below!
### JIRA
- [ ] My PR addresses the following
[GOBBLIN-2087](https://issues.apache.org/jira/browse/GOBBLIN-2087) issues and
references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
- https://issues.apache.org/jira/browse/GOBBLIN-XXX
### Description
Changes made:
- Added a new property `OPTIONAL_DB_NAME` for specifying an optional
database name.
- Updated the constructor and methods to check for and use the optional
database name if provided.
- Added logging to indicate when the optional database name is used and
replaces the pattern-extracted database name.
- Ensured backward compatibility by retaining the existing behavior when the
optional database name is not provided.
These changes enhance the flexibility and usability of the
`DatasetHiveSchemaContainsNonOptionalUnion` class, allowing for more dynamic
database configurations and reducing dependency on the dataset URN format.
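The override behavior described above could be sketched roughly as follows. This is an assumption-laden illustration rather than the code in this PR: the class name, the method `resolveDbName`, and the property key string are hypothetical (only the constant name `OPTIONAL_DB_NAME` comes from the description), and logging is reduced to a `System.out` call.

```java
import java.util.Optional;
import java.util.Properties;

// Illustrative only: not the actual Gobblin class.
final class OptionalDbNameSketch {
  // Hypothetical key; the real value of OPTIONAL_DB_NAME is not shown in this description.
  static final String OPTIONAL_DB_NAME = "example.optional.db.name";

  // Prefers the configured database name; falls back to the pattern-extracted one.
  static String resolveDbName(Properties props, String patternExtractedDb) {
    Optional<String> override = Optional.ofNullable(props.getProperty(OPTIONAL_DB_NAME))
        .map(String::trim)
        .filter(s -> !s.isEmpty());
    if (override.isPresent()) {
      // Stand-in for the logging mentioned in the change list.
      System.out.printf("Using optional database name '%s' instead of pattern-extracted '%s'%n",
          override.get(), patternExtractedDb);
      return override.get();
    }
    // Backward-compatible path: no override configured.
    return patternExtractedDb;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(resolveDbName(props, "tracking")); // prints "tracking"

    props.setProperty(OPTIONAL_DB_NAME, "override_db");
    System.out.println(resolveDbName(props, "tracking")); // logs the override, then prints "override_db"
  }
}
```

Treating a blank value the same as an absent one is a design choice of this sketch, so that an empty setting does not silently replace a valid extracted name.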
### Tests
- [ ] Added a new test case to verify the behavior with the optional database name:
  - `testContainsNonOptionalUnionWithOptionalDbName`: Verifies that the
  optional database name is used when provided and replaces the
  pattern-extracted database name.
### Commits
- [ ] My commits all reference JIRA issues in their subject lines, and I
have squashed multiple commits if they address the same issue. In addition, my
commits follow the guidelines from "[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not "adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
Issue Time Tracking
-------------------
Worklog Id: (was: 923263)
Remaining Estimate: 0h
Time Spent: 10m
> Enhance DatasetHiveSchemaContainsNonOptionalUnion to Support Optional
> Database Name
> -----------------------------------------------------------------------------------
>
> Key: GOBBLIN-2087
> URL: https://issues.apache.org/jira/browse/GOBBLIN-2087
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: pawan teja
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> **Summary:**
> The current implementation of the `DatasetHiveSchemaContainsNonOptionalUnion`
> class requires the database name to be extracted from the dataset URN using a
> regex pattern. This approach limits flexibility and can lead to errors if the
> URN format changes. To enhance the flexibility and usability of this class,
> we need to add support for an optional database name.
> **Current Issue:**
> - The database name must be extracted from the dataset URN using a regex
> pattern.
> - This dependency on the URN format limits flexibility and can lead to errors
> if the format changes.
> - Users cannot specify a database name directly, which could be more
> intuitive and flexible.
> **Proposed Solution:**
> - Introduce a new property `OPTIONAL_DB_NAME` in the
> `DatasetHiveSchemaContainsNonOptionalUnion` class.
> - Update the constructor and methods to check for the optional database name
> and use it if provided.
> - Add logging to indicate when the optional database name is used and when it
> replaces the pattern-extracted database name.
> - Ensure backward compatibility by retaining the existing behavior when the
> optional database name is not provided.
> **Acceptance Criteria:**
> - The `DatasetHiveSchemaContainsNonOptionalUnion` class should support an
> optional database name.
> - If the optional database name is provided, it should replace the database
> name extracted from the URN pattern.
> - The class should maintain its current functionality when the optional
> database name is not provided.
> - Appropriate logging should be added to indicate the use of the optional
> database name.
> - Tests should be added to verify the new functionality, including cases
> where the optional database name is and is not provided.
> These enhancements will improve the flexibility and usability of the
> `DatasetHiveSchemaContainsNonOptionalUnion` class, allowing for more dynamic
> database configurations and reducing dependency on the dataset URN format.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)