Re: [Fish-users] Re: A suggestion for implementing hashes in Fish

Axel Liljencrantz Wed, 04 Jan 2006 08:45:17 -0800

Sorry for not partaking in this discussion sooner...

On 1/3/06, Netocrat < [EMAIL PROTECTED]> wrote:

John Brock wrote:

[...]

> In Perl if you assign a hash to an array you get alternating keys
> and values, which is where I got this idea.  The additional thought
> that occurred to me was that there was really no reason why you
> needed two data types with two namespaces -- that in fact you could
> use a single data type both ways.  This just seems simpler to me.
> Why multiply entities if you don't have to?

Exactly.  And the PHP-equivalent alternative avoids multiple entities.
It's simpler than Perl's hash/array duality.  Your proposal retains
multiple entities - a hash (using {}) and an array (using []).
Regardless that /internally/ you propose these be stored in a common
format, from the user's perspective they are mappings between two
different data types.  I've pointed out that your hash model could be
used identically to our current proposal, meaning that your hash=>array
mapping index operator (using [], although for backwards compatibility
the meaning of your [] and {} operators would need to be reversed) is an
"optional extra", but you don't seem to recognise it.

That is indeed the main advantage of the [] map syntax. And it is a very important one, what with my obsession with small syntaxes. :)

On the other hand, the {} map syntax also has some advantages.

* No potentially weird/unintuitive corner cases when creating an array and converting it to a map
* No issues with exporting variables to other programs
* Easy to implement
* Easy to list the keys, 'echo $foo[(seq 1 2 (count $foo))]', and the values, 'echo $foo[(seq 2 2 (count $foo))]'. This could even be implemented as a shellscript function, so you'd just use 'keys $foo' or 'values $foo'.
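
To make the last point concrete, here is a rough model of the {} scheme, written in Python rather than fish because the proposed map syntax does not exist in the shell. It assumes a map is just a flat list with keys at the odd (1-based) positions and values at the even ones. The function names keys, values and lookup are only illustrative, not part of any proposal.

```python
# Model of the {} proposal: a map is an ordinary list whose odd positions
# (1-based, as in fish) hold keys and whose even positions hold values.

def keys(items):
    """Elements at 1-based positions 1, 3, 5, ... (like seq 1 2 (count $foo))."""
    return items[0::2]

def values(items):
    """Elements at 1-based positions 2, 4, 6, ... (like seq 2 2 (count $foo))."""
    return items[1::2]

def lookup(items, key):
    """Linear scan over the key positions; returns None if the key is absent."""
    for i in range(0, len(items) - 1, 2):
        if items[i] == key:
            return items[i + 1]
    return None

foo = ["smurf", "blue", "grass", "green"]
print(keys(foo))
print(values(foo))
print(lookup(foo, "grass"))
```

The linear scan in lookup corresponds to the simple "search the array from beginning to end" implementation mentioned later in the thread. A trailing key with no value is simply ignored in this sketch.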

I think both syntaxes have merit, and I'd be very happy to see more replies about which syntax other people prefer.

> I don't see how doing this is redundant -- it's just an alternative

Because the main new operation required on a hash is "get keys". Your
scheme allows this, but rather than "get keys", it only supports "get
keys with values interspersed". Yes, you could take care of this using
odd-only indexing, but why complicate code that doesn't need to be
complicated?

I'm not sure I see the huge advantage for the [] map syntax here. 'echo $foo' prints all the keys, and 'echo $foo[$foo]' prints all the values. It works, but the latter, while cool, seems somewhat unintuitive. The {} map solution is (IMO) just as easy. (See above)

> way of looking at a list -- and in particular I don't see how you
> lose backwards compatibility with the original meaning of an array.
> Anything you could do before with a variable you can still do --
> it's just that you've just been given some new syntax that lets
> you do some new things with a variable that you couldn't do before
> (i.e., look up the values of even elements based on the values of
> odd elements). In practice a user would normally use a particular
> variable one way or the other,

Right, and that's my point - it's an infrequently used mapping that
doesn't require operator-level support.

I guess the question is whether the language becomes larger/more bloated by adding a new operator instead of overloading an existing one.

> but all operations on any variable
> would always make sense and yield reasonably reasonable results.
> (Note that if you wanted to iterate through all the keys or values
> you could just increment your loop counter by 2 instead of 1 --
> although maybe you would also want special functions for extracting
> all the keys or values, again like Perl). Some things that one

Those special functions are frequent enough to warrant an operator,
whereas the mapping you propose is infrequently used enough to be
implemented as a function instead.

Hadn't really thought of that option before. That would remove the need for map functionality from the shell altogether. But the syntax would be significantly less compact. I'll have to think about how this would look in real code.

> might reasonably do with an array -- such as deleting a single
> element -- would totally redefine the hash lookup.  But as the joke
> goes, if it hurts when you do that, then don't do that!
>
> I'm less optimistic about one thing though.  Originally I thought
> that the hash lookup could at first be implemented by just searching
> an array from beginning to end for a key (which would be a fast
> and easy thing to do), and then at some later date something more
> sophisticated could be done by attaching indexes to a variable the
> first time it was used as a hash, without changing the semantics
> (i.e., the array values would not be scrambled when you used a
> variable as a hash).  I assumed there was some way to do this that was
> at least reasonably efficient, even if you didn't end up with the
> most highly optimized hashes on the market.  But I'm not so
> sure any more, and I suspect that if you want even moderately
> efficient hashes you are going to have to accept that using a
> variable as a hash scrambles the array in some way.  Logically this
> isn't necessary, but in practice it may be, and it detracts from
> the prettiness of the scheme.

I noticed something similar but didn't want to focus on it in my last
response.  Assuming that an index is used, and that the structure is
initially treated as a hash, then when mapping from hash to array, how
is order determined?  Alphabetically?  By time-of-insertion?  By
internal index order?  The answers can be specified, but they're not
necessarily going to be intuitive.

If more speed is desired, then the order must be either arbitrary (hash algorithm) or alphabetical (binary search algorithm). The latter should be a reasonable tradeoff between intuitiveness and speed. It should be possible to use a red-black tree, giving logarithmic performance on all operations. But the initial implementation would probably use time-of-insertion order, since that is the easiest for a naive implementation.
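
As a rough illustration of the alphabetical option, the following Python sketch keeps the keys in sorted order and finds them by binary search. It is not a red-black tree: Python's standard library has none, so a sorted list with the bisect module stands in for it. That makes lookups logarithmic, but insertion still shifts elements, which a balanced tree would avoid. The class name SortedMap and its methods are made up for the example.

```python
import bisect

class SortedMap:
    """Keys kept in alphabetical order; lookup uses binary search."""

    def __init__(self):
        self._keys = []
        self._values = []

    def set(self, key, value):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value          # existing key: overwrite
        else:
            self._keys.insert(i, key)        # O(n) shift; a balanced tree avoids this
            self._values.insert(i, value)

    def get(self, key, default=""):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return default

    def keys(self):
        return list(self._keys)

m = SortedMap()
m.set("smurf", "blue")
m.set("grass", "green")
m.set("apple", "red")
print(m.keys())   # alphabetical, regardless of insertion order
```

With this ordering, converting the map to an array always gives the same, predictable sequence, at the cost of losing the order in which the user added the entries.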

> I still think it's a good idea
> though; it gives you all the power of Perl hashes, with only one
> variable type and namespace.
>
> Or almost all the power. Does fish support sparse arrays? I.e.,
> can you:
>
> set a[1] Hello
> set a[1000] Goodbye
>
> and then use a[500] and a[1500] (returning the value "")?

Did you try it?

Yes, it does, but the missing empty elements are space-separated on
expansion. Last I looked at the source, fish would internally be
storing 999 ARRAY_SEP characters for the missing elements.

The 'missing' elements become empty elements, i.e. each one of them contains the empty string.
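
A small Python model of that behaviour, assuming only what is described above (assigning past the end pads the gap with empty strings). The helper name set_index is hypothetical, and reads beyond the end of the array are deliberately left out of the sketch.

```python
def set_index(arr, index, value):
    """Assign arr[index] (1-based, as in fish), padding any gap with empty strings."""
    if index < 1:
        raise IndexError("indices are 1-based in this model")
    while len(arr) < index:
        arr.append("")
    arr[index - 1] = value

a = []
set_index(a, 1, "Hello")
set_index(a, 1000, "Goodbye")

print(len(a))           # the array now really has 1000 elements
print(repr(a[500 - 1])) # element 500 exists and is the empty string
```

So the array is not sparse in the storage sense: the 998 elements between the two assignments all exist, each holding the empty string.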

>>[...]
>>
>>>>The advantage to all this is that you don't have to introduce a
>>>>new data type for hashes,
>
>>PHP supports a similar data type to that previously discussed on this
>>list - an array that may be indexed traditionally (numeric keys) or as a
>>hash aka map (string keys) using the same indexing operators [].  This
>>seems to be compatible with the model John suggests for the hash
>>indexing operators {}.
>
> But what if you want to use "1" as a hash key?  Either you need

String "1" is equivalent to integer 1 (in a shell the difference is not
very pronounced anyhow); both are a key that maps to a value.  If
treating the hash as an array, conceptually that key is the first
element, but practically (internal representation) it could be located
anywhere.  Where's the problem?

The problem that I can see is all the potential inconsistency when silently moving from array to map.

>#create array
>set arr foo bar baz
>echo $arr
foo bar baz
>#Use it as a map
>set arr[smurf] blue
>echo $arr
1 2 3 smurf

Not 100% intuitive, IMO.
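
The same surprise can be reproduced with a small Python model of the [] proposal. It assumes that an array, once used as a map, is treated as if its 1-based positions were the keys, and that expanding a map yields its keys. The helper name to_map is invented for the example.

```python
def to_map(arr):
    """Reinterpret a list as a map keyed by 1-based position strings."""
    return {str(i): v for i, v in enumerate(arr, start=1)}

arr = ["foo", "bar", "baz"]
print(" ".join(arr))               # plain array: expansion yields the elements

m = to_map(arr)
m["smurf"] = "blue"                # first use as a map
print(" ".join(m))                 # map: expansion now yields the keys
print(" ".join(m[k] for k in m))   # analogue of $arr[$arr]: the values
```

The second line of output is "1 2 3 smurf": a single assignment changes what the bare variable expands to, which is the inconsistency shown in the session above.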

> some kind of new syntax, or you have to accept that some strings
> cannot be used as hash keys. I think the former is preferable.

--
http://members.dodo.com.au/~netocrat

--
Axel
