2012/11/30 Nicolas <nicolas.demar...@gmail.com> > Le jeudi 29 novembre 2012 10:48:45 UTC+1, Sebastian Krebs a écrit : > >> >> >> >> 2012/11/29 Nicolas <nicolas....@gmail.com> >> >> *Hashmap*, finally after a strange wakeup in the middle of the night, i >>> researched how the array is implemented in php. I taked my last hour (or >>> two) to read wikipedia and the php sources (wikipedia, >>> <http://en.wikipedia.org/wiki/Hash_table> >>> zend_hash.h<https://github.com/php/php-src/blob/master/Zend/zend_hash.h#L261> >>> and >>> zend_hash.c<https://github.com/php/php-src/blob/master/Zend/zend_hash.c#L909> >>> ). >>> >>> If i see this >>> graph<http://upload.wikimedia.org/wikipedia/commons/7/7d/Hash_table_3_1_1_0_1_0_0_SP.svg>, >>> from wikipedia, the hash isn't use to find the keys, but to find the value >>> (the cost is O(1) at less to find the value *linked *to a key). >>> >>> If i see the code and particulary the following, i have the impress that >>> the hash table is a linked list pointer which iterate on a linked list >>> of item (call >>> bucket<https://github.com/php/php-src/blob/master/Zend/zend_hash.h#L54>with >>> key, value). If the cost was only O(1), the following function >>> zend_hash_index_**exists should just return value of a structure >>> indicated by a pointer, in this case the cost seems to be O(1) to O(n) >>> following the place of the key, for me *ulong h* is the hash : >>> >>> >>> ZEND_API int zend_hash_index_exists(const HashTable *ht, ulong h) >>> { >>> uint nIndex; >>> Bucket *p; >>> IS_CONSISTENT(ht); >>> nIndex = h & ht->nTableMask; >>> p = ht->arBuckets[nIndex]; >>> while (p != NULL) { >>> if ((p->h == h) && (p->nKeyLength == 0)) { >>> return 1; >>> } >>> >>> p = p->pNext; >>> } >>> >>> return 0; >>> } >>> >>> Finally i'm not completly sure how php implements each array, but it >>> seems to me that O(1) isn't right. And in fact even if a hash simplify the >>> search a hash can't return a memory place with a data structure or a >>> type by itself. It must be finded in a linked list before. >>> >> >> Why? :confused: A hashmap is a direct-memory structure. It's like "item >> you are looking for is at [offset] + [hash]%[maxBuckets]". This can be >> itself a pointer and I can directly jump to that address. There is >> absolutely no iteration required (and useful). >> > > After search all power of 2 superior at *32768 *cause many collisions > because all hash_key become indenticals, (hack > here<http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf>, > the correctif don't change behavior, just protect Globals). > > If you simplify all, forget the limit cases (hash collision, full > hashmap), you find O(1) as complexity, but i cannot prevent me to think > that collisions appear and the O(1), transforms itself in O(n*x) (with x as > collision rate as average case, and if you considere x as a constant... The > hash_code creation is too O(n) regarding n as the key length, it isn't > nothing. >
Do you see "p->pNext"? This means, that the result of a hash-lookup is not just a single item, but a bucket (what ht->arBuckets suggest, btw. It also suggests, that the bucket is an array). This means, there are no collision like the ones you think. A new item is appended to the array. If you want to lookup a single item, you create the hash of the key, then you fetch the bucket for that hash, than you lookup the item from that bucket with the full key. OK, here you have to use a search-operation (because we don't know the index of the item within the array), but this buckets are usually very small (like less then 5 items or such, depending on the size of the array). And regarding O-notation: O(1) * O(n<5) < O(1) * O(5) = O(1) > > Consequently to the collision, the first key in the map have a directly > access by hash, with more chance than the key at the end of the map. > Access to arrays/hashmaps are always direct. No exception (else it is not an array/hashmap ;)). > > I think always that a big array is a sensible data, which could be design > with attention, this was just my term here. > > Thanks to php to hide that :) But should we forget there pointers, >>> structures and others c behaviors? >>> >> >> We should, because we are not developing in C. Don't give them too much >> value, they are implementation details of the underlying engine and usually >> there a negliable against what your application _really_ does (especially >> IO). We could even take registers and such into account, but where should >> it end? >> > > I'm not completly agree, you should think about your underlying > dependencies, with references and registers as you say, but too with all > the libraries directly in your workflow : how curl works, if you make > http/ftp request, how database works (even with an > orm<http://docs.doctrine-project.org/en/2.1/reference/batch-processing.html> > )... > > And as the C is the php base, with the C too. > > A little *joke *test, view many times but surelly exact > https://gist.github.com/4176171, and showing flagrants differences if you > take time to test it (one to 4). And the response is in thinking about > pointer and affectation prefer than php variables. I agree that many others > thing are importants (efficiency, custumer satisfactions, tests, > reusability...), but we cannot trash all optimisation. > The problem with this test is, that this is an intended behaviour of the implementation. This means, that it doesn't care, wether it is implemented in C, Java, Python, or whatever, thus the underlying implementation remains uninteresting. I must admit, that there are cases, where it is interesting to know, how something works internally, but I my experience (I spend some time in my younger years about such topics too) there are quite rare and in most cases simply not worth it. > > >> >> >>> However, the autoloading take time and was clearly showed on my profiler >>> traces, and even if this seems not that symfony will win many speed, it >>> will loose nothing and *gain *in the same time a *little speed* and *clarity >>> with hermetic environnements.* >>> >> >> Be careful with such statements: It may loose stability by introducing >> complexity. The 1% is it definitely not worth. Be careful with micro >> optimizations! >> > > 1. I'm not sure hermetic and definable environnements in composer and > consequently in class map could be consider as a complexity (seems easy > coding), neither a micro optimization and this was the main code objective. > 2. Surelly but no code optimization is a worst case particularly *when > the computer speed don't growth and so many posts shoot symfony for its > speed. T*oo many in my mind, this is the reason which could me to begin a > discussion on speed. > Are we talking about Symfony1 or 2? Because I must say, that I don't see such an impact with 2 (at least under production with the related performance tuning like asset-dump, http-cache and such). Maybe I look at the wrong place? > > On this, good weekend, maybe at the next week. > Have a nice weekend :) A performance discussion could become interesting, but I would prefer to avoid micro-optimization discussion :X Regards, Sebastia > > >> >> >>> * >>> * >>> Le mercredi 28 novembre 2012 17:05:15 UTC+1, Sebastian Krebs a écrit : >>> >>>> As long as a classmap is a map, it's a hashmap and therefore the access >>>> is usually O(1), meaning: Doesn't care, where the corresponding class >>>> appears within the list. >>>> >>>> >>>> 2012/11/28 Nicolas <nicolas....@gmail.com> >>>> >>>>> Tests maded with apc, and even with apc the class isn't load if the >>>>> file isn't load, so apc don't avoid the autoload mecanismus. >>>>> >>>>> For the differences beetween dev and prod environnement, i think the >>>>> number of classes isn't the only point, the order is to a problem, as >>>>> symfony is at the end of the class map (the order is alphabetical), so >>>>> each >>>>> symfony class take more than 1500 comparisons strings to just load the >>>>> classes at every requests. >>>>> >>>>> Le mercredi 28 novembre 2012 06:46:55 UTC+1, Cameron Junge a écrit : >>>>> >>>>>> Were the tests performed with & without APC? APC will load the file >>>>>> straight from memory in a compiled form, so should help speed things up >>>>>> if >>>>>> the original tests were performed without it. Most production >>>>>> environments >>>>>> use a cache. >>>>>> >>>>>> Building an optimized classmap for each environment would probably >>>>>> only alter a small number of classes, so not sure what benefit there is >>>>>> in >>>>>> having multiple versions. For example, the difference between dev & >>>>>> production on one of my projects would probably be a few 10s or so >>>>>> classes... and some of them will be the automatic exchange of a debug >>>>>> class >>>>>> & the original class. >>>>>> >>>>>> Cameron >>>>>> >>>>>> On Saturday, November 24, 2012 5:15:44 PM UTC+13, Nicolas wrote: >>>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> As i play all this afternoon with xdebug, i could see that* the >>>>>>> class loader* take time and mainly *is called many times (~900 by >>>>>>> request)*. >>>>>>> >>>>>>> As i wanted* to improve the speed of a production environnement,* i >>>>>>> made tests and watch in details as the code is called. >>>>>>> >>>>>>> After take improvement give by the docs and finally by the command >>>>>>> >>>>>>>> php composer.phar dump-autoload --optimize >>>>>>> >>>>>>> >>>>>>> I* remove *from the file *vendor/composer/autoload_classmap.php* >>>>>>> produced *the tests classes* (252 on 2890, less than 1%) and run >>>>>>> tests with and without there changes. The result of the tests are >>>>>>> in a gist <https://gist.github.com/4137837>. The gain is a >>>>>>> little more than one percent on a complex page and less significant on a >>>>>>> hello world page. >>>>>>> >>>>>>> My personnals conclusions are the array class's map should be >>>>>>> optimised. Solutions could be than : >>>>>>> >>>>>>> - A *arbitrary* *environnements *should be defined* in composer*to >>>>>>> correspond to applications and produce specific class_loader files >>>>>>> purposing *to reduce the array size* >>>>>>> >>>>>>> >>>>>>> - The* order of calling frequencies* should impose* t*he *order >>>>>>> in composer requirements* which should save it in *the order of >>>>>>> the classes's map *file (vendor/composer/autoload_[**env**** >>>>>>> ]_classmap.php)*, to reduce the search time* >>>>>>> >>>>>>> I have begin to post there solutions in a composer issue created by >>>>>>> Seldaek few month ago. But as i maded tests on symfony, as i have a >>>>>>> little voice in the symfony community, and i think you could think >>>>>>> better, >>>>>>> and certainly produce smarter code than me about there behaviors with >>>>>>> better acuity (mostly as i see the time i spend to make this mail >>>>>>> and so... and the hour it is). >>>>>>> >>>>>>> So good night, or have nice day, >>>>>>> >>>>>>> Best Regards, >>>>>>> >>>>>>> Nicola >>>>>>> >>>>>>> PS: Very thanks for all you works, symfony and composer and all the >>>>>>> library become more and more nice to use. >>>>>>> >>>>>> -- >>>>> If you want to report a vulnerability issue on symfony, please send it >>>>> to security at symfony-project.com >>>>> >>>>> You received this message because you are subscribed to the Google >>>>> Groups "symfony developers" group. >>>>> To post to this group, send email to symfon...@googlegroups.com >>>>> >>>>> To unsubscribe from this group, send email to >>>>> symfony-devs...@**googlegroups.**com >>>>> >>>>> For more options, visit this group at >>>>> http://groups.google.com/**group**/symfony-devs?hl=en<http://groups.google.com/group/symfony-devs?hl=en> >>>>> >>>> >>>> >>>> >>>> -- >>>> github.com/KingCrunch >>>> >>>> -- >>> If you want to report a vulnerability issue on symfony, please send it >>> to security at symfony-project.com >>> >>> You received this message because you are subscribed to the Google >>> Groups "symfony developers" group. >>> To post to this group, send email to symfon...@googlegroups.com >>> To unsubscribe from this group, send email to >>> symfony-devs...@**googlegroups.com >>> For more options, visit this group at >>> http://groups.google.com/**group/symfony-devs?hl=en<http://groups.google.com/group/symfony-devs?hl=en> >>> >> >> >> >> -- >> github.com/KingCrunch >> >> -- > If you want to report a vulnerability issue on symfony, please send it to > security at symfony-project.com > > You received this message because you are subscribed to the Google > Groups "symfony developers" group. > To post to this group, send email to symfony-devs@googlegroups.com > To unsubscribe from this group, send email to > symfony-devs+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/symfony-devs?hl=en > -- github.com/KingCrunch -- If you want to report a vulnerability issue on symfony, please send it to security at symfony-project.com You received this message because you are subscribed to the Google Groups "symfony developers" group. To post to this group, send email to symfony-devs@googlegroups.com To unsubscribe from this group, send email to symfony-devs+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/symfony-devs?hl=en