Hi, I hope I can contribute a more detailed explanation here.

> The first question is: Why does doctrine sort the commit order by classes
> and not the actual entity graph it could generate by using the information of
> UnitOfWork and `associationMappings`?

When we created the CommitOrderCalculator, Roman and I debated which approach would be the most performant way to define the commit order:

1- Consider purely the known entities inside of the UoW
2- Consider the known ClassMetadata

The reasons why we chose option 2 are very simple:

- We are always working with a fixed set, not one that floats with the number of known entities
- Otherwise, large changesets (bulk operations, for example) would take a critical amount of time to be processed.

> However, after digging into Doctrine with my debugger, I saw Doctrine generates for
> each entity a separate INSERT query, even if each identifier is known using sequences in PostgreSQL.

We knew that from the start. We penalized bulk operations because most of the time we're dealing with auto-generated keys, which have to be mapped back to the entities as their identifiers. We could, however, determine the execution method based on class metadata information (looking at its idGenerator type) and execute single or multi statements accordingly. The problems we found were all related to how we could get back all the IDs from a single statement. Remember that sequence-based drivers may be running under highly concurrent execution, so the numeric sequence may not be fully deterministic. I remember setting up an Oracle XE at home to do some tests around that. The other problem was that UoW complexity would increase and degrade its overall optimized performance.

> I also discovered that when having a (optional) circular dependency between entities
> Doctrine resolves those always with a 'INSERT without FK' value and later a
> 'UPDATE SET FK=x' strategy, even when a circular dependency is not given in *this*
> commit-round. (but of course it might happen in the next flush/commit round).

CommitOrderCalculator is very loose about full circular dependencies.
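To make "loose" concrete, here is a minimal sketch in Python (chosen only for brevity; this is not Doctrine's actual PHP code, and `commit_order`, `State` and `CycleError` are hypothetical names). It assumes a class-level dependency map and runs a depth-first topological sort over it. When the loose variant meets a node that is still IN_PROGRESS, it records and skips that edge instead of failing; the strict variant raises.

```python
from enum import Enum


class State(Enum):
    NOT_VISITED = 0
    IN_PROGRESS = 1
    VISITED = 2


class CycleError(Exception):
    """Raised by the strict variant when a dependency cycle is found."""


def commit_order(dependencies, strict=False):
    """Return (order, skipped_edges) for a class-level dependency map.

    `dependencies` maps each class name to the classes that must be
    inserted before it. Every class must appear as a key.
    """
    state = {node: State.NOT_VISITED for node in dependencies}
    order = []
    skipped = []

    def visit(node):
        state[node] = State.IN_PROGRESS
        for dep in dependencies[node]:
            if state[dep] is State.IN_PROGRESS:
                # Back edge: we are inside a cycle.
                if strict:
                    raise CycleError(f"{node} -> {dep}")
                # Loose variant: remember the edge and move on.
                skipped.append((node, dep))
                continue
            if state[dep] is State.NOT_VISITED:
                visit(dep)
        state[node] = State.VISITED
        order.append(node)

    for node in dependencies:
        if state[node] is State.NOT_VISITED:
            visit(node)
    return order, skipped


# Hypothetical mapping: User and Address reference each other.
graph = {
    "User": ["Address"],
    "Address": ["User"],
    "Order": ["User"],
}
order, skipped = commit_order(graph)
print(order)    # ['Address', 'User', 'Order']
print(skipped)  # [('Address', 'User')]
```

In this sketch the skipped edge `('Address', 'User')` means Address rows are written before User rows, so that foreign key would have to be left empty on INSERT and filled in by a later UPDATE. Because the map is built per class rather than per entity, the edge is skipped even in a flush where no actual entities form a cycle.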
The CommitOrderCalculator does not break if a visited node is already IN_PROGRESS. That is the subtle difference between
https://github.com/doctrine/doctrine2/blob/master/lib/Doctrine/ORM/Internal/CommitOrderCalculator.php
and
https://github.com/doctrine/data-fixtures/blob/2.0/lib/Doctrine/Fixture/Sorter/TopologicalSorter.php

Funny enough, you're the first ever to even slightly mention that.

Now, getting back to your point: because our CommitOrderCalculator is loose, we had to trick the executor by enforcing extraUpdates inside of the UoW. That was our decision, based on the fact that most PHP apps/developers never cared about a full dependency breakdown, and that enforcing it could potentially reduce adoption of Doctrine and also cause innumerable bug reports.

> This is actual the result of not using a entities graph to resolve dependencies but only
> the associationMappings and simple topological sorting.

Yes, that was the downside of choosing a deterministic amount of data to process versus a non-deterministic one.

> Is there a concrete reason why Doctrine has chosen this particular way to resolve
> relations/dependencies? I only wonder because this implementation does not
> utilize maximum performance of various databases using bulk insert etc, but
> rather suffers actually more from the overhead it generates.

We really wanted to get back to bulk operations if possible, but as it stands right now, only the assigned id generation type could benefit from them. Since it's the least used generator type, it does not pay off the performance hit we may get by adding an extra decoupling layer.

If you have any other questions, feel free to ask.

PS: Sorry for my late reply. I've been quite overloaded lately and your thread almost got missed.

Cheers,

On Fri, Oct 17, 2014 at 1:01 PM, Marco Pivetta <[email protected]> wrote:

> On 17 October 2014 00:04, Marc J. Schmidt <[email protected]> wrote:
>
>> Is there a concrete reason why Doctrine has chosen this particular way to
>> resolve relations/dependencies?
>> I only wonder because this implementation does not
>> utilize maximum performance of various databases using bulk insert etc,
>> but rather suffers actually more from the overhead it generates.
>
> In addition to what Benjamin already said, we also currently lack a
> bulk-insert API (Steve Müller is working on it, but it won't hit 2.5).
>
> Marco Pivetta
>
> http://twitter.com/Ocramius
>
> http://ocramius.github.com/
>
> --
> You received this message because you are subscribed to the Google Groups
> "doctrine-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/doctrine-user.
> For more options, visit https://groups.google.com/d/optout.

--
Guilherme Blanco
MSN: [email protected]
GTalk: guilhermeblanco
Toronto - ON/Canada
