Taragolis commented on code in PR #38040:
URL: https://github.com/apache/airflow/pull/38040#discussion_r1520172677
##########
airflow/providers/google/cloud/transfers/postgres_to_gcs.py:
##########
@@ -112,7 +112,13 @@ def __init__(
self.cursor_itersize = cursor_itersize
def _unique_name(self):
- return f"{self.dag_id}__{self.task_id}__{uuid.uuid4()}" if
self.use_server_side_cursor else None
+ """
+ Generates a deterministic UUID for the cursor name,
+ using the combination of DAG ID and task ID.
+ """
+ if self.use_server_side_cursor:
+ return str(uuid5(uuid5(NAMESPACE_OID, self.dag_id), self.task_id))
Review Comment:
Maybe deterministic here is not a better choose, e.g. it will create the
same cursor name for the same task.
with different dagrun/map_id/try_number
So maybe random part here it required, and better sanitise non-random part
to the specific length
Some simple snippet
```python
from slugify import slugify
import uuid
def sample(dag_id, task_id):
separator = "__"
random_suffix = str(uuid.uuid4())
available_length = 63 - len(random_suffix) - len(separator) * 2
return separator.join(
(
slugify(dag_id, allow_unicode=False, max_length=available_length
// 2),
slugify(task_id, allow_unicode=False,
max_length=available_length // 2),
random_suffix
)
)
samples = [
("a" * 100, "b" * 100),
("a" * 10, "b" * 10),
("a" * 10, "b" * 10),
("dag", "task1"),
("Lorem ipsum dolor sit amet.", "Lorem ipsum dolor sit amet."),
]
for task_id, dag_id in samples:
result = sample(task_id, dag_id)
print(f"result={result}, length={len(result)}")
```
It is also require to add
[`slugify`](https://pypi.org/project/python-slugify/) as dependency [for Google
Provider](https://github.com/apache/airflow/blob/1e6140ba250a9a99b5f1b7d73aa689e89d4152ec/airflow/providers/google/provider.yaml#L89).
This is a core airflow dependency, but better to avoid the situation if it
removed from core dependency in the future.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]