Hi Marcus, May be it will work. I think we should make a test and measure the speed difference.
My suggestion is test bucket with first field "key" (having "h" as first) and, first field "pData". The only real disadvantage is 4-byte difference. :) Alignment is not a real problem. Buckets are always aligned by C compilers or (e)malloc(). Thanks. Dmitry. > -----Original Message----- > From: Marcus Boerger [mailto:[EMAIL PROTECTED] > Sent: Monday, March 13, 2006 12:02 PM > To: Dmitry Stogov > Cc: php-i18n@lists.php.net > Subject: Re: [PHP-I18N] Hash api change > > > Hello Dmitry, > > Monday, March 13, 2006, 7:43:05 AM, you wrote: > > > Hi Marcus, > > > I agree that we can allocate Bucket and string in one block. But we > > will need to initialize pointer in Bucket.key.zstr and then > > dereference it every time we access string part of key. This will a > > slowdown. > > If is a simple add and store operation. This usually tracks > down to three assembler ops. That is much faster than having > to create a new struct for each apply function's callback of > which we have a few. > > > Of course we can access it directly as arKey, but for what > reason we > > need this change? > > I didn't say that hash internal operations should use it. But > the apply stuff can simply bypass the adress of the > zend_hash_key if we go with that layout. > > > Also each bucket will eat more RAM. > > wow 4 bytes. > > > One more thing: having "h" as first field of Bucket allows faster > > access to "h" or more compact code as C compiler doesn't need add > > offset. > > On an X86 this is only a single byte more in the assembler opcodes. > > > I didn't understand what you mean in "not alignable memory block in > > the integer index case". > > In the current struct we force the string container to hold > at least one byte even if we don't use it. that means that > three bytes are unused. If the compiler doesn't automatically > align the data the next block is unaligned. Also if not > forcing to use any byte you get back the 4 bytes i added in > the above in case of interger indices. > > best regards > marcus > > >> -----Original Message----- > >> From: Marcus Boerger [mailto:[EMAIL PROTECTED] > >> Sent: Sunday, March 12, 2006 1:59 PM > >> To: Dmitry Stogov > >> Cc: php-i18n@lists.php.net > >> Subject: Re: [PHP-I18N] Hash api change > >> > >> > >> Hello Dmitry, > >> > >> you are right about the strign allocation, we have this in > >> 4 and 5 as well. But a minor change would even captutre that too: > >> > >> typedef struct bucket { > >> ulong h; /* Used for numeric indexing */ > >> uint nKeyLength; > >> void *pData; > >> void *pDataPtr; > >> struct bucket *pListNext; > >> struct bucket *pListLast; > >> struct bucket *pNext; > >> struct bucket *pLast; > >> HashKey key; > >> union { > >> char s[]; /* Must be last element */ > >> UChar u[]; /* Must be last element */ > >> } arKey; > >> } > >> > >> Also you had that member being [1] though the language allows > >> to have it be an incomplete type ([]). I prefer the latter > >> becuase that does not force to allocate a not alignable > >> memory block in the integer index case. If we preceed this > >> way there is no need to allocate two blocks and instead we > >> can easily fill in the pointer for the string key by a simple > >> addition: > >> > >> // for the integer index case > >> bucket p = pemalloc(sizeof(bucket), persistent); > >> > >> // for the string index case > >> bucket p = pemalloc(sizeof(bucket)+key_len, persistent); > >> p->key.arKey.s = p + offsetof(p, arKey); > >> memmove(p->key.s, key_value, key_len); > >> > >> My opinion here is anyway that having to deal with only one > >> struct is much easier and also faster. Being able to change > >> the malloc to a single pointer add operation is a nice addon. > >> > >> best regards > >> marcus > >> > >> Sunday, March 12, 2006, 11:11:32 AM, you wrote: > >> > >> > Hi Marcus, > >> > >> > The idea make sense, but note that this modification > will need two > >> > malloc() for each bucket with string index. First for > bucket itself > >> > and the second for "arKey". (now they are done in one call) > >> > Also we will lose some performance in dereferencing pointer > >> to arKey in each > >> > ZendHash operation. > >> > >> > Thanks. Dmitry. > >> > >> > >> >> -----Original Message----- > >> >> From: Marcus Boerger [mailto:[EMAIL PROTECTED] > >> >> Sent: Saturday, March 11, 2006 3:22 PM > >> >> To: php-i18n@lists.php.net > >> >> Subject: [PHP-I18N] Hash api change > >> >> > >> >> > >> >> Hello php-i18n, > >> >> > >> >> i think we should change the api a bit: > >> >> > >> >> From: > >> >> > >> >> typedef struct _key { > >> >> zend_uchar type; > >> >> union { > >> >> char s[1]; /* Must be last element */ > >> >> UChar u[1]; /* Must be last element */ > >> >> } arKey; > >> >> } HashKey; > >> >> > >> >> typedef struct bucket { > >> >> ulong h; > >> >> /* Used for numeric indexing */ > >> >> uint nKeyLength; > >> >> void *pData; > >> >> void *pDataPtr; > >> >> struct bucket *pListNext; > >> >> struct bucket *pListLast; > >> >> struct bucket *pNext; > >> >> struct bucket *pLast; > >> >> HashKey key; /* Must be last element */ > >> >> } Bucket; > >> >> > >> >> To: > >> >> > >> >> typedef struct _key { > >> >> ulong h; /* Used for numeric indexing */ > >> >> uint nKeyLength; > >> >> zend_uchar type; > >> >> zstr arKey; /* Must be last element */ > >> >> } HashKey; > >> >> > >> >> typedef struct bucket { > >> >> void *pData; > >> >> void *pDataPtr; > >> >> struct bucket *pListNext; > >> >> struct bucket *pListLast; > >> >> struct bucket *pNext; > >> >> struct bucket *pLast; > >> >> HashKey key; /* Must be last element */ > >> >> } Bucket; > >> >> > >> >> So now HashKey matches zend_hash_key just by pure reordering. > >> >> > >> >> Also we should probably change the apply_with_arguments > stuff to > >> >> pass the tsrm key. That is we'd change from: > >> >> > >> >> typedef int (*apply_func_args_t)(void *pDest, int num_args, > >> >> va_list args, zend_hash_key *hash_key); ZEND_API void > >> >> zend_hash_apply_with_arguments(HashTable *ht, apply_func_args_t > >> >> apply_func, int, ...); > >> >> > >> >> > >> >> To: > >> >> > >> >> typedef int (*apply_func_args_t)(void *pDest TSRMLS_DC, int > >> >> num_args, va_list args, zend_hash_key *hash_key); ZEND_API void > >> >> zend_hash_apply_with_arguments(HashTable *ht TSRMLS_DC, > >> >> apply_func_args_t apply_func, int, ...); > >> >> > >> >> Further more our add_assoc functions currently only > allow handling > >> >> of native string indexes. This should be changed to > allow unicode > >> >> indexes as well: > >> >> > >> >> ZEND_API int add_assoc_long_ex(zval *arg, char *key, > uint key_len, > >> >> long n); > >> >> > >> >> TO: > >> >> > >> >> ZEND_API int add_assoc_long_ex(zval *arg, zend_uchar type, zst > >> >> *key, uint key_len, long n); > >> >> > >> >> Though maybe it is considerable to just add more > functions i don't > >> >> think flooding the api is a good idea and prefer to > only have one > >> >> version that can easily deal with both. > > > > -- PHP Unicode & I18N Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php