does c) really help? We'd still need a unique key on the existing PK columns - i.e. does Mysql just complain about the length/size of the PK index, but not other unique indexes?
Irrespective of mysql's quirks, we would almost never look up rows by this int-pk (we look up by dag+task+key, and often but not _awlays_ by execution_date as well), so I don't think we'd get any other benefits from it. -ash On Feb 27 2020, at 1:25 pm, Jarek Potiuk <[email protected]> wrote: > I think the main issue here is not performance but to make it works and be > compatible with 1.10. . > > And indeed if we go back for I'd index we would have to check the impact. > Was insert performance the original reason why it has changed ? Anyone > knows ? > > I am not sure what it is going to be but taking into account that those > inserts in xcom table are always individual insert (not bulks) and there is > likely more reads than writes - it might be either way easily. > > J.. > czw., 27 lut 2020, 13:55 użytkownik Kamil Breguła <[email protected]> > napisał: > > > Hello, > > Creating a sequence/AUTOINCREMENT will have a very large impact on the > > speed of adding rows. > > More information: https://docs.sqlalchemy.org/en/13/faq/performance.html > > In the near future I wanted to speed up the saving of items by using a > > low-level API. This will not work if we different PK. > > > > Best regards, > > Kamil > > > > On Thu, Feb 27, 2020 at 1:50 PM Jarek Potiuk <[email protected]> > > wrote: > > > > > > I am testing now 5.7 in 1.10 branch with the PR here: > > > https://github.com/apache/airflow/pull/7569 and it seems it's an easy > > > one one - seems it will "just work". > > > > > > However while testing it (I analysed a customer problem) I have found > > > that we have an issue in 2.0 with MySQL (not only 5.7 - it's the same > > > for 5.6 and even 8.0 (!). It boils down to an index (at least one) is > > > to big for the recommended uf8mb4 encoding: > > > > > > ALTER TABLE xcom ADD CONSTRAINT pk_xcom PRIMARY KEY (dag_id, task_id, > > > `key`, execution_date). > > > > > > The utf8mb4 encoding is needed to handle 4-byte UTF characters (mostly > > > emojis but also some Chinese characters). > > > > > > The question is what do we do with that. I think we have two options. > > > Not sure what was the story behind increasing the lengths (apparently > > > the problem does not > > > > > > a) decrease sizes of some of the fields (that would make backwards > > > incompatibility) > > > b) partial indexes of the fields (not really - this is primary key so > > > I guess it's not a good idea, although MYSQL supports indexing only > > > part of the string - we could cut key uniqueness at 200 chars for > > > example - but again, backwards incompatibility is a problem) > > > c) revert to id - generated primary kay and make secondary index > > > without the key - there are usually not many xcoms per task, so I > > > would not expect any problems - actually I would expect this will > > > speed up searches because the index will be much, much smaller (50% > > > more or less). I think this is the most viable option. > > > > > > Any other ideas? > > > I described all details in this issue: > > > https://issues.apache.org/jira/browse/AIRFLOW-6947 > > > > > > Here is similar story of somone who had similar issue > > > https://serversforhackers.com/c/mysql-utf8-and-indexing > > > > > > J. > > > > > > Jarek Potiuk > > > Polidea | Principal Software Engineer > > > > > > M: +48 660 796 129
