potiuk commented on code in PR #24613:
URL: https://github.com/apache/airflow/pull/24613#discussion_r905834855


##########
airflow/migrations/versions/0113_2_4_0_add_dataset_model.py:
##########
@@ -0,0 +1,57 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add Dataset model
+
+Revision ID: 0038cd0c28b4
+Revises: 424117c37d18
+Create Date: 2022-06-22 14:37:20.880672
+
+"""
+
+import sqlalchemy as sa
+from alembic import op
+from sqlalchemy import Integer, func
+
+from airflow.migrations.db_types import TIMESTAMP, StringID
+from airflow.utils.sqlalchemy import ExtendedJSON
+
+revision = '0038cd0c28b4'
+down_revision = '424117c37d18'
+branch_labels = None
+depends_on = None
+airflow_version = '2.4.0'
+
+
+def upgrade():
+    """Apply Add Dataset model"""
+    op.create_table(
+        'dataset',
+        sa.Column('id', Integer, primary_key=True, autoincrement=True),
+        sa.Column('uri', StringID(length=1000)),

Review Comment:
   One thing to note: we should be careful about the size of indexes.
If we want to make this field unique, that means a unique index. If we use the
utf8mb4 encoding for MySQL, a length of 1000 is already too big: one character
in utf8mb4 takes up to 4 bytes (so up to 4000 bytes here), and the index size
limit is 3072 bytes for both MySQL 5.7 and 8. We might want to change the
encoding to utf8mb3 instead (see how we limit sizes of other IDs - we have a
custom field type for that), but even there the maximum we can get is 1024
characters (3072 / 3).
   
   Another option would be a prefix index, but a unique prefix index only
enforces uniqueness of the prefix, so if we want to make the whole field
unique, then utf8mb3 and 1024 is the best we can do.
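
   To make the arithmetic above concrete, here is a minimal sketch. It assumes
the 3072-byte index limit and the worst-case bytes per character mentioned
above; the helper names (`max_indexable_chars`, `fits_in_index`) are
hypothetical and not part of Airflow or SQLAlchemy.

```python
# Assumption: index key limit of 3072 bytes, as stated in the comment above.
INDEX_KEY_LIMIT_BYTES = 3072

# Worst-case storage per character for the two MySQL encodings discussed.
MAX_BYTES_PER_CHAR = {"utf8mb4": 4, "utf8mb3": 3}


def max_indexable_chars(charset: str, limit: int = INDEX_KEY_LIMIT_BYTES) -> int:
    """Largest column length (in characters) that fits fully in one index key."""
    return limit // MAX_BYTES_PER_CHAR[charset]


def fits_in_index(length: int, charset: str, limit: int = INDEX_KEY_LIMIT_BYTES) -> bool:
    """True if a column of `length` characters can be indexed without a prefix."""
    return length * MAX_BYTES_PER_CHAR[charset] <= limit


if __name__ == "__main__":
    for charset in MAX_BYTES_PER_CHAR:
        print(
            f"{charset}: max {max_indexable_chars(charset)} chars, "
            f"length 1000 fits: {fits_in_index(1000, charset)}"
        )
```

   Under these assumptions a length of 1000 does not fit with utf8mb4 (the
limit there is 768 characters) but does fit with utf8mb3, where the ceiling is
1024 characters.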



