RE: [PHP-I18N] Hash api change

Dmitry Stogov Mon, 13 Mar 2006 01:54:28 -0800

Hi Marcus,

May be it will work.
I think we should make a test and measure the speed difference.


My suggestion is test bucket with first field "key" (having "h" as first)
and, first field "pData".

The only real disadvantage is 4-byte difference. :)

Alignment is not a real problem. Buckets are always aligned by C compilers
or (e)malloc().

Thanks. Dmitry.

> -----Original Message-----
> From: Marcus Boerger [mailto:[EMAIL PROTECTED] 
> Sent: Monday, March 13, 2006 12:02 PM
> To: Dmitry Stogov
> Cc: php-i18n@lists.php.net
> Subject: Re: [PHP-I18N] Hash api change
> 
> 
> Hello Dmitry,
> 
> Monday, March 13, 2006, 7:43:05 AM, you wrote:
> 
> > Hi Marcus,
> 
> > I agree that we can allocate Bucket and string in one block. But we 
> > will need to initialize pointer in Bucket.key.zstr and then 
> > dereference it every time we access string part of key. This will a 
> > slowdown.
> 
> If is a simple add and store operation. This usually tracks 
> down to three assembler ops. That is much faster than having 
> to create a new struct for each apply function's callback of 
> which we have a few.
> 
> > Of course we can access it directly as arKey, but for what 
> reason we 
> > need this change?
> 
> I didn't say that hash internal operations should use it. But 
> the apply stuff can simply bypass the adress of the 
> zend_hash_key if we go with that layout.
> 
> > Also each bucket will eat more RAM.
> 
> wow 4 bytes.
> 
> > One more thing: having "h" as first field of Bucket allows faster 
> > access to "h" or more compact code as C compiler doesn't need add 
> > offset.
> 
> On an X86 this is only a single byte more in the assembler opcodes.
> 
> > I didn't understand what you mean in "not alignable memory block in 
> > the integer index case".
> 
> In the current struct we force the string container to hold 
> at least one byte even if we don't use it. that means that 
> three bytes are unused. If the compiler doesn't automatically 
> align the data the next block is unaligned. Also if not 
> forcing to use any byte you get back the 4 bytes i added in 
> the above in case of interger indices.
> 
> best regards
> marcus
> 
> >> -----Original Message-----
> >> From: Marcus Boerger [mailto:[EMAIL PROTECTED]
> >> Sent: Sunday, March 12, 2006 1:59 PM
> >> To: Dmitry Stogov
> >> Cc: php-i18n@lists.php.net
> >> Subject: Re: [PHP-I18N] Hash api change
> >> 
> >> 
> >> Hello Dmitry,
> >> 
> >>   you are right about the strign allocation, we have this in
> >> 4 and 5 as well. But a minor change would even captutre that too:
> >> 
> >> typedef struct bucket {
> >>         ulong h;        /* Used for numeric indexing */
> >>         uint nKeyLength;
> >>         void *pData;
> >>         void *pDataPtr;
> >>         struct bucket *pListNext;
> >>         struct bucket *pListLast;
> >>         struct bucket *pNext;
> >>         struct bucket *pLast;
> >>         HashKey key;
> >>         union {
> >>                 char  s[];   /* Must be last element */
> >>                 UChar u[];   /* Must be last element */
> >>         } arKey;
> >> }
> >> 
> >> Also you had that member being [1] though the language allows
> >> to have it be an incomplete type ([]). I prefer the latter 
> >> becuase that does not force to allocate a not alignable 
> >> memory block in the integer index case. If we preceed this 
> >> way there is no need to allocate two blocks and instead we 
> >> can easily fill in the pointer for the string key by a simple 
> >> addition:
> >> 
> >> // for the integer index case
> >> bucket p = pemalloc(sizeof(bucket), persistent);
> >> 
> >> // for the string index case
> >> bucket p = pemalloc(sizeof(bucket)+key_len, persistent);
> >> p->key.arKey.s = p + offsetof(p, arKey);
> >> memmove(p->key.s, key_value, key_len);
> >> 
> >> My opinion here is anyway that having to deal with only one
> >> struct is much easier and also faster. Being able to change 
> >> the malloc to a single pointer add operation is a nice addon.
> >> 
> >> best regards
> >> marcus
> >> 
> >> Sunday, March 12, 2006, 11:11:32 AM, you wrote:
> >> 
> >> > Hi Marcus,
> >> 
> >> > The idea make sense, but note that this modification 
> will need two
> >> > malloc() for each bucket with string index. First for 
> bucket itself 
> >> > and the second for "arKey". (now they are done in one call)
> >> > Also we will lose some performance in dereferencing pointer 
> >> to arKey in each
> >> > ZendHash operation.
> >> 
> >> > Thanks. Dmitry.
> >> 
> >> 
> >> >> -----Original Message-----
> >> >> From: Marcus Boerger [mailto:[EMAIL PROTECTED]
> >> >> Sent: Saturday, March 11, 2006 3:22 PM
> >> >> To: php-i18n@lists.php.net
> >> >> Subject: [PHP-I18N] Hash api change
> >> >> 
> >> >> 
> >> >> Hello php-i18n,
> >> >> 
> >> >>   i think we should change the api a bit:
> >> >> 
> >> >> From:
> >> >> 
> >> >> typedef struct _key {
> >> >>         zend_uchar type;
> >> >>         union {
> >> >>                 char  s[1];   /* Must be last element */
> >> >>                 UChar u[1];   /* Must be last element */
> >> >>         } arKey;
> >> >> } HashKey;
> >> >> 
> >> >> typedef struct bucket {
> >> >>         ulong h;                                              
> >> >>   /* Used for numeric indexing */
> >> >>         uint nKeyLength;
> >> >>         void *pData;
> >> >>         void *pDataPtr;
> >> >>         struct bucket *pListNext;
> >> >>         struct bucket *pListLast;
> >> >>         struct bucket *pNext;
> >> >>         struct bucket *pLast;
> >> >>         HashKey key; /* Must be last element */
> >> >> } Bucket;
> >> >> 
> >> >> To:
> >> >> 
> >> >> typedef struct _key {
> >> >>         ulong h;     /* Used for numeric indexing */
> >> >>         uint nKeyLength;
> >> >>         zend_uchar type;
> >> >>         zstr arKey;   /* Must be last element */
> >> >> } HashKey;
> >> >> 
> >> >> typedef struct bucket {
> >> >>         void *pData;
> >> >>         void *pDataPtr;
> >> >>         struct bucket *pListNext;
> >> >>         struct bucket *pListLast;
> >> >>         struct bucket *pNext;
> >> >>         struct bucket *pLast;
> >> >>         HashKey key; /* Must be last element */
> >> >> } Bucket;
> >> >> 
> >> >> So now HashKey matches zend_hash_key just by pure reordering.
> >> >> 
> >> >> Also we should probably change the apply_with_arguments 
> stuff to 
> >> >> pass the tsrm key. That is we'd change from:
> >> >> 
> >> >> typedef int (*apply_func_args_t)(void *pDest, int num_args, 
> >> >> va_list args, zend_hash_key *hash_key); ZEND_API void 
> >> >> zend_hash_apply_with_arguments(HashTable *ht, apply_func_args_t 
> >> >> apply_func, int, ...);
> >> >> 
> >> >> 
> >> >> To:
> >> >> 
> >> >> typedef int (*apply_func_args_t)(void *pDest TSRMLS_DC, int 
> >> >> num_args, va_list args, zend_hash_key *hash_key); ZEND_API void 
> >> >> zend_hash_apply_with_arguments(HashTable *ht TSRMLS_DC, 
> >> >> apply_func_args_t apply_func, int, ...);
> >> >> 
> >> >> Further more our add_assoc functions currently only 
> allow handling 
> >> >> of native string indexes. This should be changed to 
> allow unicode 
> >> >> indexes as well:
> >> >> 
> >> >> ZEND_API int add_assoc_long_ex(zval *arg, char *key, 
> uint key_len, 
> >> >> long n);
> >> >> 
> >> >> TO:
> >> >> 
> >> >> ZEND_API int add_assoc_long_ex(zval *arg, zend_uchar type, zst 
> >> >> *key, uint key_len, long n);
> >> >> 
> >> >> Though maybe it is considerable to just add more 
> functions i don't 
> >> >> think flooding the api is a good idea and prefer to 
> only have one 
> >> >> version that can easily deal with both.
> 
> 
> 
> 

-- 
PHP Unicode & I18N Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

RE: [PHP-I18N] Hash api change

Reply via email to