Re: Map type

Torben Hoffmann Tue, 15 Mar 2011 07:30:56 -0700

On Tue, Mar 15, 2011 at 14:02, Eric Merritt <[email protected]> wrote:


> Hey Torben,
>
> On Tue, Mar 15, 2011 at 10:40:44AM +0100, Torben Hoffmann wrote:
> >    Hi,
> >
> >    First: great initiative, but not a small one...
>
> I know. Unfortunately, it seems that no one else is doing it and I
> have a need so ....
>
> hopefully we can get it right and it will be useful to others.
>

That would be great indeed!


>
>
>
> >    I doubt that parameterized modules will be the right solution to this
> >    problem.
> >    Note: I have nothing against parameterized modules, I just don't think
> >    they are the right tool for this job.
>
> Let's talk it out. I wrestled between two different approaches and have
> implemented them both. One was somewhat similar to what you describe
> here.
>
> >
> >    Your problem: common interface to the key-value stores provided by
> >    Erlang so that you can easily change the implementation by changing
> >    the initialisation to use a different background module.
>
>
> Yup. The existing dictionary-like modules are going to have to be
> wrapped. So you will end up with something along the lines of
>
>  ec_dictionary.erl (behaviour definition)
>  ec_gb_trees.erl (wrapper for gb_trees that implements the
>  ec_dictionary behaviour)
>  ec_dict.erl (wrapper for dict that implements the ec_dictionary
>  behaviour)
>
>  ...
>
>  In this model, you will have the cost of an extra function call for
>  each of these wrapped modules, though I hope that in the future more
>  native implementations occur.  So for, say, gb_trees and dict you
>  would end up using it like this:
>
>  Dictionary1 = ec_gb_trees:new().
>  Dictionary2 = ec_dict:new().
>
>  NewDictionary1 = Dictionary1:put(SomeKey, SomeValue).
>  NewDictionary2 = Dictionary2:put(SomeKey, SomeValue).
>

You have lost me a bit here... how would Dictionary1 hold both a module
name and a value?


>
> In this case it wouldn't matter what the concrete implementation of
> dictionary is, as long as it implemented the ec_dictionary interface.
>
> I have some utilities that check that a particular module implements
> an interface, but in general you just have to trust that the
> caller is passing you the right value. Of course, that's pretty normal
> in Erlang.
>

Trust it at runtime but test the hell out of it and run dialyzer like crazy.


>
> >
> >    This is what OO dudes do with inheritance. And that can be pretty cool
> >    when done by a good OO-programmer.
>
> It's not actually inheritance and has little to do with OO. It's the
> ability to define an interface for a class of types. Haskell has it,
> Lisp has it, etc.  We actually have inheritance in Erlang (similarly
> unsupported as parametrized types), so if we need it we can use it.
> Well, perhaps it is some limited form of inheritance, though in this
> model no actual code is shared. It's simply the interface that is
> defined and coded to.
>
> >
> >    Parameterized modules are more like a poor man's implementation of
> >    functors in SML. Functors are cool, but in practice you tend to avoid
> >    the hassle... at least I do.
>
> It might not be a focus of the parametrized modules but they do a
> very good job of abstracting the module that is actually called and
> that's the mode I want to use them in here. I am torn by this. It has
> the good quality that it makes the module being called
> abstract. Unfortunately this hides the module being called (a nice
> little conundrum), obscuring what code is actually being
> executed. That is, it reduces the apparent locality of the code. This
> is a bad thing, but all of the other approaches except the one you
> detail here have similar problems.
>

Well, isn't that a matter of how many layers of indirection you want?

My approach adds the thinnest indirection layer by having everything in the
same module.
I like that sort of thing since it makes debugging extremely easy.
Another bonus of coding it so directly as I did is that if you run xref on
the module it will actually figure out what is going on! I have done some
code where the module was a parameter and that puts a real stick in the
wheel of xref.

Another thing is dialyzer - I am not quite sure that it handles
parameterized modules very well, so I would experiment with that before
playing more with those things...


>
>
> >    If you have a bunch of modules that all have the same API then it
> >    makes sense to use the parameterized module.
> >
> >    In this case you are facing a wide variety of APIs (this has been on
> >    the eq mailing list numerous times) so you would gain next to nothing
> >    by using parameterized modules since you would have to add exceptions
> >    to the code depending on what base module you are using.
> >
> >    Since the code has to deal with different APIs I would make a regular
> >    module called gen_dict (or gen_map or ...) with the following API
> >    (fill out the blanks...):
> >
> >    -opaque gen_map() :: #my_map{}.
> >
> >    -spec new(base_module()) -> gen_map().
> >    new(dict) -> #my_map{base=dict,content=dict:new()};
> >    new(lists) -> #my_map{base=lists,content=[]}.
> >
> >    -spec size(gen_map()) -> non_neg_integer().
> >    size(#my_map{base=dict,content=C}) -> dict:size(C);
> >    size(#my_map{base=lists,content=C}) -> length(C).
> >
> >    And I would probably test the implementation with PropEr
> >    ([1]https://github.com/manopapad/proper) to make sure it really
> >    works.
>
> I was not aware of PropEr at all! I have wanted to use QuickCheck for
> a long time on projects like sinan. The fact that there is a FOSS
> implementation is a huge deal. Thanks for letting me know.
>
> >    I can write the test specification for you if you want to take
> >    that route since I have used QuickCheck at work quite a lot (see
> >    EUC 2010 for the outcome).
>
> I will absolutely take you up on your offer if you are still
> interested once we get the approach hashed out.
>

The testing should not be all that affected by the chosen implementation so
I am up for a bit of fun!


>
> >
> >    The same thing can be done for sets.
> >
> >    But if you have found some clever way to avoid the branching out on
> >    base module then I would be very interested in seeing that!
>
> There is one major problem, I think a critical fault, in this: you
> must know all the possible implementations at the time you are
> coding the interface. I think this is a killer for a set of
> types. I believe that client developers need the ability to implement
> the interface after the creation of the interface for it to be
> useful.
>
> My first implementation used a similar approach. Though it auto called
> the function in the module defined by base. So your size looked
> something like
>
>   size(#my_map{base = Module, content = C}) -> Module:size(C).
>
> It retained the ability to be extended at run time, but still required
> that a wrapper be defined to implement the interface. I was coding a
> wrapper for assoc lists as a test when I realized I had just
> reinvented parametrized modules.
>

Hmmmm, I am not quite sure that I see the issue here - could you show me a
bit of your code?

My approach has - as you point out - the drawback that you have to extend it
every time you need a new base module.
This you have to weigh up against the benefit of giving xref, dialyzer et al an
easier job.
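
For reference, here is a minimal sketch of that closed-dispatch module,
filling in the gen_dict outline quoted above. The add/3 function, its use of
dict:store/3 and lists:keystore/4, and the field types are assumptions made
for illustration, not something agreed in this thread:

```erlang
-module(gen_dict).
-export([new/1, add/3, size/1]).
-export_type([gen_map/0]).
%% size/1 is also an auto-imported BIF; make the local definition explicit.
-compile({no_auto_import, [size/1]}).

-record(my_map, {base :: dict | lists, content :: term()}).
-opaque gen_map() :: #my_map{}.

%% Pick the backing implementation once, at construction time.
-spec new(dict | lists) -> gen_map().
new(dict)  -> #my_map{base = dict,  content = dict:new()};
new(lists) -> #my_map{base = lists, content = []}.

%% add/3 is hypothetical: insert or overwrite the value stored under Key.
-spec add(term(), term(), gen_map()) -> gen_map().
add(Key, Value, #my_map{base = dict, content = C} = Map) ->
    Map#my_map{content = dict:store(Key, Value, C)};
add(Key, Value, #my_map{base = lists, content = C} = Map) ->
    %% The list backend is an association list of {Key, Value} tuples.
    Map#my_map{content = lists:keystore(Key, 1, C, {Key, Value})}.

-spec size(gen_map()) -> non_neg_integer().
size(#my_map{base = dict,  content = C}) -> dict:size(C);
size(#my_map{base = lists, content = C}) -> length(C).
```

Supporting a further base module such as gb_trees means adding one clause to
each function, which is exactly the extension cost mentioned above. In return,
every call target is a literal module name that xref can see.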

Using your behaviour based solution one could let new return something like
#my_map{} above and then let a generic wrapper module handle the operations
on the data:

Dict1 = ec_dict:new().
Dict2= gen_dict:add(1,Dict1).

or you could - and probably should - change the instantiation to:

Dict1 = gen_dict:new(dict). %% pick implementation at this point.
Dict2= gen_dict:add(1,Dict1).

This might be a more Erlangish way of doing it since it uses a behaviour as
a restriction for the mapping modules and it wraps the indirection with a
simple data structure. (It does make xref unhappy I guess, but dialyzer
should be a happy trooper at this point.)
This might be exactly the same as using parameterized modules, so it would
require an experiment to judge what is the cleaner way of doing it.
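
A minimal sketch of that open variant, assuming the wrapper contract is
new/0, store/3 and size/1. Those callback names are an assumption, picked
because dict and orddict in stdlib already export them; a module with a
different API, such as gb_trees, would still need a thin wrapper like the
ec_gb_trees mentioned earlier:

```erlang
-module(gen_dict).
-export([new/1, add/3, size/1]).
-export_type([gen_map/0]).
%% size/1 is also an auto-imported BIF; make the local definition explicit.
-compile({no_auto_import, [size/1]}).

%% base holds the name of any module that exports the assumed contract:
%%   new() -> Content
%%   store(Key, Value, Content) -> Content
%%   size(Content) -> non_neg_integer()
-record(my_map, {base :: module(), content :: term()}).
-opaque gen_map() :: #my_map{}.

%% Pick the implementation at this point, e.g. new(dict) or new(orddict).
-spec new(module()) -> gen_map().
new(Module) ->
    #my_map{base = Module, content = Module:new()}.

%% Delegate to the stored module instead of branching on a known list.
-spec add(term(), term(), gen_map()) -> gen_map().
add(Key, Value, #my_map{base = Module, content = C} = Map) ->
    Map#my_map{content = Module:store(Key, Value, C)}.

-spec size(gen_map()) -> non_neg_integer().
size(#my_map{base = Module, content = C}) ->
    Module:size(C).
```

With this shape gen_dict:new(orddict) works without touching gen_dict. The
price is that Module:store/3 and friends are dynamic calls, which xref can
only report as unresolved.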

I would say that the solution that keeps most of the analysis tools happy is
the one to go with. I have not used the tools with parameterized modules so
I have no clue if it makes it a mess or not.

Cheers,
Torben


>
> >    Cheers,
> >    Torben
> >
> >    On Tue, Mar 15, 2011 at 02:49, Eric Merritt <[2][email protected]>
> >    wrote:
> >
> >      >
> >      > Isn't this just another implementation of a dictionary?
> >      > Aren't there enough of those in stdlib?
> >
> >      I guess I wasn't clear. I have no interest at all in implementing a
> >      dict. We have half a dozen different key value style stores in
> >      various libraries distributed with erlang. My goal is to provide a
> >      common interface to those implementations. I have a ton of places I
> >      just care to take a 'map' and don't care about the actual
> >      implementation. I want to support that use case instead of
> >      specializing on gb_trees, or dict or what have you.
> >
> >      > How is this better?
> >
> >      It's not better, simply a common interface to these types.
> >      > At least use the same function names as in the dict API for
> >      > similar functions.
> >      >
> >      > Is the set() type in the specs the same as stdlib:set() or are You
> >      > defining Your own.
> >
> >      sets:set() in stdlib is a specific implementation of a set. I am not
> >      planning on defining sets.
> >      > I am not that happy with the name map. Since it seems to be the
> >      > same thing as what is normally called dictionaries in Erlang I
> >      > think it would be better to use that. Or relation since that is
> >      > what it is in a mathematical sense.
> >
> >      That seems reasonable to me, I think dictionary is probably best
> >      though it's a bit more typing. Relation is more correct but I think
> >      it's probably less intuitive.
> >      > Also map is, at least for me, too closely coupled to the higher
> >      > order function. I suspect that before the week is over You are
> >      > going to add the function map/2 and that will just look silly:
> >      > map:map(Fun,Map).
> >
> >      Yes I agree in this case. That particular name collision bothered me
> >      as well. This is why it's good to ask for comments ;)
> >      > See more inline below.
> >      >
> >      > /Anders
> >      >
> >      >>
> >      >> -opaque map(_KeyT, _ValueT) :: term().
> >      >>
> >      >> %% @doc empties the map.
> >      >> -spec clear() -> map().
> >      >>
> >      >
> >      > Shouldn't this be
> >      > -spec new() -> map() instead?
> >      > Or is this really a mutable datatype?
> >      > In which case I suppose it should be
> >      > -spec clear(map()) -> map()
> >      > Or is it just me not understanding parameterized modules?
> >
> >      Semantically it's exactly the same. It's probably worth just having
> >      new, and discarding clear since it is less meaningful in a language
> >      without mutation.
> >      >> %% @doc returns a boolean indicating if this map has the
> >      >> %% specified key
> >      >> -spec has(term()) -> boolean().
> >      >
> >      > Should be has_key
> >
> >      Probably to be clear what it actually does.
> >      >
> >      >>
> >      >> %% @doc check to see if the specified value exists in the map
> >      >> -spec has_value(term()) -> boolean().
> >      >>
> >      >> %% @doc return the key value pairs as a set of {key, value} pairs
> >      >> -spec as_set() -> set().
> >      >>
> >      >> %% @doc get the item from the map
> >      >> -spec get(term()) -> term().
> >      >>
> >      > define types
> >      > key() = term()
> >      > value() = term()
> >      > -spec get(key()) -> value().
> >
> >      Ok, good idea.
> >
> >      >> %% @doc return all keys in the map as a set
> >      >> -spec keys(term()) -> set().
> >      >
> >      > What is the parameter here?
> >
> >      lol, nothing. A fat finger on my part.
> >      >>
> >      >> %% @doc Add a value to the map
> >      >> -spec put(term(), term()) -> map().
> >      >
> >      > -spec add(term(), term()) -> map().
> >      > To match remove
> >      >
> >      >>
> >      >> %% @doc remove a value from the map
> >      >> -spec remove(term()) -> map().
> >      >>
> >      >> %% @doc get the current number of key value pairs in the map
> >      >> -spec size() -> number().
> >      >
> >      > integer() instead of number()
> >
> >      ok. also good recommendation. I will post a new spec tomorrow along
> >      with the interface and implementation.
> >      >>
> >      >> %% @doc get all the values in the map as a sequence
> >      >> -spec values() -> sequence().
> >      >>
> >      >> As you can tell from the specs I am going to define sets and
> >      >> sequences very quickly as well.
> >      >>
> >      >> --
> >      >> Eric Merritt
> >      >> Erlang & OTP in Action (Manning) [3]http://manning.com/logan
> >      >> [4]http://twitter.com/ericbmerritt
> >      >> [5]http://erlware.org
> >      >>
> >      >> --
> >      >> You received this message because you are subscribed to the
> >      >> Google Groups "erlware-dev" group.
> >      >> To post to this group, send email to [6][email protected].
> >      >> To unsubscribe from this group, send email to
> >      >> [7][email protected].
> >      >> For more options, visit this group at
> >      >> [8]http://groups.google.com/group/erlware-dev?hl=en.
> >      >>
> >      >>
> >
> >    --
> >    [15]http://www.linkedin.com/in/torbenhoffmann
> >
> > References
> >
> >    Visible links
> >    1. https://github.com/manopapad/proper
> >    2. mailto:[email protected]
> >    3. http://manning.com/logan
> >    4. http://twitter.com/ericbmerritt
> >    5. http://erlware.org/
> >    6. mailto:[email protected]
> >    7. mailto:erlware-dev%[email protected]
> >    8. http://groups.google.com/group/erlware-dev?hl=en
> >    9. mailto:[email protected]
> >   10. mailto:erlware-dev%[email protected]
> >   11. http://groups.google.com/group/erlware-dev?hl=en
> >   12. mailto:[email protected]
> >   13. mailto:erlware-dev%[email protected]
> >   14. http://groups.google.com/group/erlware-dev?hl=en
> >   15. http://www.linkedin.com/in/torbenhoffmann
>
> --
> Eric Merritt
> Erlang & OTP in Action (Manning) http://manning.com/logan
> http://twitter.com/ericbmerritt
> http://erlware.org
>



-- 
http://www.linkedin.com/in/torbenhoffmann
