#18376: New encoding structure for linear codes
-------------------------+-------------------------------------------------
       Reporter:         |        Owner:
  dlucas                 |       Status:  needs_work
           Type:         |    Milestone:  sage-6.8
  enhancement            |   Resolution:
       Priority:  major  |    Merged in:
      Component:         |    Reviewers:
  coding theory          |  Work issues:
       Keywords:         |       Commit:
        Authors:  David  |  aa42238c6754cff63c5cc42e8d77e78b36074fbc
  Lucas                  |     Stopgaps:
Report Upstream:  N/A    |
         Branch:         |
  u/dlucas/encoder       |
   Dependencies:         |
-------------------------+-------------------------------------------------
Changes (by ncohen):

 * status:  needs_review => needs_work


Comment:

 Hellooooooo again,

 It seems that your branch has become uncompatible with the latest beta.
 Turns
 out that it is my fault: #18926 `^^;`

 > >`encoder_default_name` -> `default_encoder`?
 >
 > I kept `encoder_default_name` as it's the name of the encoder (the
 string that
 > points to it), and not the object in itself.

 I see. Would it be possible to pick 'default_encoder_name' then? It sounds
 "more
 english" to say "the default encoder's name". Otherwise it seems that you
 are
 talking of the default name of an encoder.

 > This encoder is designed to put generic linear codes (`LinearCode`
 class) into
 > the new framework. And in `LinearCode`, you have a class parameter
 > `_generator_matrix` because one has to provide a generator matrix when
 he
 > build a "structureless" linear code.  As `generator_matrix()` method in
 > `AbstractLinearCode` returns the generator matrix given by the default
 > encoder, which is, for these `LinearCode`s the one you pointed out, if
 this
 > encoder had this `self.code().generator_matrix()` in its
 `generator_matrix()`
 > method, it will lead to an infinite loop when calling
 > `LinearCode.generator_matrix()`.

 I do not think that I follow this hierarchy of classes and calls between
 them as
 well as you do. My point here is that the code tries to meddle with the
 private
 attributes of another class, and I thought that it could be easily
 avoided. I
 can understand that there would be an infinite loop somewhere if you only
 did
 the replacement I proposed, so what about *moving* this cache check to the
 `.generator_matrix` method of the `code()` object? In this situation, you
 would
 be able to perform the replacement I proposed, and the only method which
 would
 try to detect private attributes would do so *in its own class* which is a
 bit
 'cleaner'.

 > Actually, the information set can be computed from any generator matrix,
 and
 > it will always return the same result whatever the generator matrix I
 used to
 > compute it. As the default encoder of a code has a valid
 `generator_matrix`
 > method, and as `C.generator_matrix()` calls the default encoder, the
 code of
 > `information_set()` will always return a valid result.

 Agreed for validity, now what about speed? I agree to let it lie, but
 having
 these things appear is a sign that those methods are exposed too high,
 especially when you want to handle several encoders at once.

 > >Really, these methods `encode` and `unencode` have nothing to do in
 > >`LinearCode`. You should "accept your own design" and have all these
 things
 > >be available only in the `Encoder` object, otherwise there will be a
 crazy
 > >amount of copy/paste with those arguments.
 >
 > I do think they should be there. A lot of users won't care about the
 kind of
 > encoding/decoding which will be used underneath as long as they get what
 they
 > expect.

 That's hardly a problem: they can do `encoder = my_code.encoder()` and
 then play
 with the encoder. They can even do `my_code.encoder().encode()` as this
 thing is
 cached.

 > For these users, being able to directly call `MyCode.encode(m)` or
 > `MyCode.unencode(c)` instead of having to ask for the list of available
 > encoders, pick the one they want, create it, and call `encode` or
 `unencode`
 > on it is a really helpful feature.

 That's not true, as the example above shows. They do not have to go
 through the
 list, or understand it, or anything else. Your design here is leading you
 to
 copy the functions of an encoder to a higher-level class, and it does not
 sound
 at all like a good idea from the programming point of view. From the
 user's
 point of view this is also debattable, for having users call 'encoder()'
 makes
 them aware that several encoders are available.

 You designed your code to have Encoder objects, and quite naturally that's
 where
 the `encode/unencode` methods live. If you distinguish a code and its
 encoder,
 you should accept the consequences. And they are not so bad on the user's
 side,
 for (s)he can create the encoder without making any choice, i.e. by
 calling
 `.encoder()`.

 > >I do not understand why the `Encoder` class provides methods to
 encode/decode
 > >linear codes from the matrix, given that you cannot be sure that a
 matrix is
 > >available. Shouldn't this be in `LinearCodeGeneratorMatrixEncoder`?
 >
 > It actually provides a default implementation that will avoid spending
 time
 > (and repeating code) in case of you have a generator matrix for your
 code. If
 > you know a generator matrix for the encoder you're writing, you just
 have to
 > implement `generator_matrix` and you get immediately `encode` and
 `unencode`.
 >
 > If you don't have one, you have to override these methods in your
 > encoder. It's just a way to make developer's life easier `:)`

 I did not mean that the code should be replaced, but that it should be
 moved
 somewhere else. You define methods which assume *more* on the object than
 what
 it should (i.e. that the matrix is available). If it is instanciated
 without
 that matrix these functions will break.

 What i suggest is to move them to a class (inheriting from Encoder) whose
 name
 makes it clear that the generator matrix should be available. In this
 situation,
 anybody creating a class that inherits from that other class will do it in
 full
 knowledge that a matrix has to be available, and does not run the risk of
 inheriting broken methods.

 You can pick any name you like for that other class, of course.

 > In `AbstractLinearCode`, we already provide a method called `encode` and
 > another called `encoder`. I think adding a third one called `encoders`
 might
 > lead to some confusion amongst the users, especially as `encoder`
 returns an
 > `Encoder` object, one might think `encoders` will return a list of all
 > possible `Encoder`s objects available for the provided code. Plus, I
 think the
 > `_available` actually adds value, by meaning "this is the list of all
 encoders
 > you can access with your linear code".

 I disagree with the implementation choice and the explanations, but I am
 ready
 to classify it under "matters of taste", in which case I don't have any
 reason
 to force mine, given that you are the one who writes the code.

 > About the dictionary, if I just want to build a specific encoder by
 using
 > `C.encoder(name='mySpecificName`), all I care about is actually the list
 of
 > names of all the encoders my `C` can access to. I think returning the
 whole
 > dictionary might be a bit confusing,

 Okay, accepted under 'confusing' when the user asks for the names only,
 and
 cannot be assumed to understand more technicality.

 When the users explicitly asks for both names and classes, however, I see
 little
 interest in returning the list of `.items()` instead of the dictionary.
 Would
 you have anything against returning a copy of the dict in this case?

 > Now, about your idea of having a `set_encoder` method. There (imho)
 several
 > problems with that, namely:
 >
 > - the existence of `_default_encoder_name` as a class argument allow all
 the
 >   methods of `LinearCode` that rely on a generator matrix to work.
 Indeed, as
 >   they call `generator_matrix()` without any argument in their bodies,
 and as
 >   we always set an encoder which knows a generator matrix as a default
 >   encoder, all these methods will always work. Now, if you allow the
 user to
 >   dynamically change this, and if he changes the encoder to another
 which does
 >   not has a generator matrix, all these methods will fail... Which is a
 bit
 >   silly, because what we only care about when the message space is not a
 >   vector space are the differences when it comes to encoding/unencoding,
 we
 >   don't want that to change the behaviour of other methods.

 You convinced me that some encoders have features that others do not have,
 and
 that for this reason it made no sense to have a 'main' encoder in the
 class.

 > - also, it will lead to the deletion of these "name" arguments, as you
 >   said. So if I want to test different encoding styles (or later on,
 decoding
 >   algorithms) I will have to reset a new encoder everytime.

 Not true. You would create several encoders, and use them directly.

 Have fun,

 Nathann

--
Ticket URL: <http://trac.sagemath.org/ticket/18376#comment:18>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.

Reply via email to