does c) really help? We'd still need a unique key on the existing PK columns - 
i.e. does Mysql just complain about the length/size of the PK index, but not 
other unique indexes?

Irrespective of mysql's quirks, we would almost never look up rows by this 
int-pk (we look up by dag+task+key, and often but not _awlays_ by 
execution_date as well), so I don't think we'd get any other benefits from it.
-ash
On Feb 27 2020, at 1:25 pm, Jarek Potiuk <[email protected]> wrote:
> I think the main issue here is not performance but to make it works and be
> compatible with 1.10. .
>
> And indeed if we go back for I'd index we would have to check the impact.
> Was insert performance the original reason why it has changed ? Anyone
> knows ?
>
> I am not sure what it is going to be but taking into account that those
> inserts in xcom table are always individual insert (not bulks) and there is
> likely more reads than writes - it might be either way easily.
>
> J..
> czw., 27 lut 2020, 13:55 użytkownik Kamil Breguła <[email protected]>
> napisał:
>
> > Hello,
> > Creating a sequence/AUTOINCREMENT will have a very large impact on the
> > speed of adding rows.
> > More information: https://docs.sqlalchemy.org/en/13/faq/performance.html
> > In the near future I wanted to speed up the saving of items by using a
> > low-level API. This will not work if we different PK.
> >
> > Best regards,
> > Kamil
> >
> > On Thu, Feb 27, 2020 at 1:50 PM Jarek Potiuk <[email protected]>
> > wrote:
> > >
> > > I am testing now 5.7 in 1.10 branch with the PR here:
> > > https://github.com/apache/airflow/pull/7569 and it seems it's an easy
> > > one one - seems it will "just work".
> > >
> > > However while testing it (I analysed a customer problem) I have found
> > > that we have an issue in 2.0 with MySQL (not only 5.7 - it's the same
> > > for 5.6 and even 8.0 (!). It boils down to an index (at least one) is
> > > to big for the recommended uf8mb4 encoding:
> > >
> > > ALTER TABLE xcom ADD CONSTRAINT pk_xcom PRIMARY KEY (dag_id, task_id,
> > > `key`, execution_date).
> > >
> > > The utf8mb4 encoding is needed to handle 4-byte UTF characters (mostly
> > > emojis but also some Chinese characters).
> > >
> > > The question is what do we do with that. I think we have two options.
> > > Not sure what was the story behind increasing the lengths (apparently
> > > the problem does not
> > >
> > > a) decrease sizes of some of the fields (that would make backwards
> > > incompatibility)
> > > b) partial indexes of the fields (not really - this is primary key so
> > > I guess it's not a good idea, although MYSQL supports indexing only
> > > part of the string - we could cut key uniqueness at 200 chars for
> > > example - but again, backwards incompatibility is a problem)
> > > c) revert to id - generated primary kay and make secondary index
> > > without the key - there are usually not many xcoms per task, so I
> > > would not expect any problems - actually I would expect this will
> > > speed up searches because the index will be much, much smaller (50%
> > > more or less). I think this is the most viable option.
> > >
> > > Any other ideas?
> > > I described all details in this issue:
> > > https://issues.apache.org/jira/browse/AIRFLOW-6947
> > >
> > > Here is similar story of somone who had similar issue
> > > https://serversforhackers.com/c/mysql-utf8-and-indexing
> > >
> > > J.
> > >
> > > Jarek Potiuk
> > > Polidea | Principal Software Engineer
> > >
> > > M: +48 660 796 129

Reply via email to