Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-10 Thread Laurent Gautier
) # return the list if i is NULL ).




> Other standard generics to be affected would be:
>
> * rbind & cbind for 2-dim arrays/matrices: they should combine the
> metadata, and for dimension-sensitive metadata can be modelled upon
> what is done with dimnames: use rowmeta (colmeta) of the first object
> with them in cbind (rbind), and combine colmeta (rowmeta) of all
> objects with them, filling with NAs/NULLs/.. for non
> metadata-sensitive objects being combined. An issue of coercing
> dimmeta of different classes may arise.


It may be good to be trigger-happy for a first pass
( stop("mismatching meta data - sorry") )... and mix-and-match use cases
might be fewer.



> * `dim<-`, but this may raise the same problem of coercing dimmeta of
> different classes.



Disabling `dim<-` is, I think, choosing sanity for now.



> ...and I agree with the rest of your comments.



Same for me (about your comments).
This thread seems to be leading to something great.


L.




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-10 Thread SIES 73
> In the case the metadata are stored in a list, that interface enforces the 
> building of a list.
> (I said to ignore implementation for now, but paradoxically this made me 
> consider possible implementations).

Creating the list on the fly if it's not stored internally as a list should be 
cheap. For example, this is done with data frames, which store dimnames in two 
separate attributes, "names" and "row.names".
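
The data frame case can be checked in base R. The snippet below is only an illustration of that remark (the variable names are made up): the attributes of a data frame include "names" and "row.names" but no "dimnames", and dimnames() assembles the two-element list when asked.

```r
# A data frame keeps its column and row labels in two separate attributes.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
attr_names <- names(attributes(df))
print(attr_names)

# dimnames() builds the familiar list(rows, columns) on demand.
dn <- dimnames(df)
str(dn)
```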



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread SIES 73
I've also had several use cases where I needed cell-like attributes, that is, 
attributes that have the same dimensions as the original array and are 
subsetted in the same way --along all its dimensions.

So we're talking about a way to add metadata to matrices/arrays at 3 possible 
levels:

1) at the whole object level: attributes that are not dropped on 
subsetting 
2) at the dimension level: attributes that behave like dimnames, 
i.e. subsetted along each dimension
3) at the cell level: attributes that are subsetted in the same way 
as the original array

My proposal would be simpler than Tony's suggestion: like dimnames, just have 
reserved attribute names for each case, say "objdata", "dimdata", and 
"celldata" (or "objattr", "dimattr" and "cellattr").

On the other hand, Tony's pattern would allow as many attributes of each type 
as necessary (some multiplicity is already possible with the simpler design, as 
"dimdata" or "celldata" could be lists of lists), at the cost of a more complex 
scheme of attributes that needs to be parsed each time.

Following Tony's suggestion, "attr.keep.on.subset" and "attr.dimname.like" (and 
possibly "attr.cell.like") could be kept in a single list with 3 elements, 
something like:

 attr(x, "attr.subset.with") <- list(object=..., dims=..., cells=...)

Would something like this make sense for R-core --either for standard arrays or 
as a new class-- or would it be better implemented in a package?

Enrique



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread SIES 73
> If "objattr", "dimattr" and "cellattr" are lists, they would offer safe 
> places for all attributes that should be kept on subsetting. 

My proposed design would be that:

* "objattr" would be a list of attributes (just preserved on subsetting)
* "dimattr" would be a list with as many elements as array dimensions. 
Each element can be any object whose length matches the corresponding array 
dimension's length and that can itself be subsetted with `[`: so it could be a 
vector, a list, a data frame...
* "cellattr" would be any object whose dimensions match the array 
dimensions: another array, a data frame...
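
A minimal sketch of how the three reserved attributes might behave on subsetting. This is not the prototype mentioned in the thread: the helper name subset_with_meta() and its `idx` argument (a list with one index vector per dimension, TRUE meaning "everything") are assumptions made for illustration, and a real implementation would presumably live in a `[` method.

```r
# Hypothetical helper: subset an array together with the metadata stored in
# the reserved attributes "objattr", "dimattr" and "cellattr".
subset_with_meta <- function(x, idx) {
  stopifnot(length(idx) == length(dim(x)))
  obj  <- attr(x, "objattr")
  dims <- attr(x, "dimattr")
  cell <- attr(x, "cellattr")

  # Plain array subsetting drops all three attributes, so they are re-attached.
  out <- do.call(`[`, c(list(x), idx, list(drop = FALSE)))
  attr(out, "objattr") <- obj                      # object level: kept as is
  if (!is.null(dims)) {
    # dimension level: each element is subsetted along its own dimension
    attr(out, "dimattr") <- Map(function(m, i) m[i], dims, idx)
  }
  if (!is.null(cell)) {
    # cell level: subsetted exactly like the data
    attr(out, "cellattr") <- do.call(`[`, c(list(cell), idx, list(drop = FALSE)))
  }
  out
}

x <- matrix(1:6, nrow = 2)
attr(x, "objattr")  <- list(units = "kg")
attr(x, "dimattr")  <- list(c("r1", "r2"), c(10, 20, 30))
attr(x, "cellattr") <- matrix(letters[1:6], nrow = 2)

y <- subset_with_meta(x, list(TRUE, 2:3))
str(attributes(y))
```

The sketch uses plain vectors as "dimattr" elements; a data frame element would be subsetted by column with a single-index `[`, which is the issue raised later in the thread.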

> In my view this would be very useful, because that way a general solution for 
> data description, like variable names, variable labels, units, ... could be 
> reached.

Indeed, that's the objective: attaching user-defined metadata that is 
automatically synchronized with subsetting operations to the actual data.

I've had dozens of use cases on my own R programs that needed this type of 
pattern, and seen it implemented in different ways in several classes (xts, 
timeSeries, AnnotatedDataFrame, etc.) As you point out, this could offer a 
unified design for a common need.

Enrique



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread Heinz Tuechler

At 11:14 09.07.2009, SIES 73 wrote:
>> If "objattr", "dimattr" and "cellattr" are lists, they would offer safe
>> places for all attributes that should be kept on subsetting.
>
> My proposed design would be that:
>
> * "objattr" would be a list of attributes (just preserved on subsetting)
> * "dimattr" would be a list with as many elements as array dimensions.
> Each element can be any object whose length matches the corresponding
> array dimension's length and that can itself be subsetted with `[`: so
> it could be a vector, a list, a data frame...
> * "cellattr" would be any object whose dimensions match the array
> dimensions: another array, a data frame...
>
>> In my view this would be very useful, because that way a general
>> solution for data description, like variable names, variable labels,
>> units, ... could be reached.
>
> Indeed, that's the objective: attaching user-defined metadata that is
> automatically synchronized with subsetting operations to the actual data.
>
> I've had dozens of use cases on my own R programs that needed this type
> of pattern, and seen it implemented in different ways in several classes
> (xts, timeSeries, AnnotatedDataFrame, etc.) As you point out, this could
> offer a unified design for a common need.
>
> Enrique



For my personal use it was sufficient to create a class called "documented"
with a corresponding subsetting method and one attribute, also called
"documented". This attribute may contain 'varlabel', 'varname',
'value.labels', 'missing.values', 'code.ordered', 'comment', ...

It is copied on subsetting.
I think attributes concerning e.g. dimensions, i.e. parts of an object,
should stay in this object-related attribute and be extracted on
subsetting. Since subsetting an object leads to a new object, this could
then have its own, new persisting attribute.

The more difficult part may be the binding of objects.
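
A rough sketch of what such a class could look like. The actual "documented" code is not shown in the thread, so the constructor and the method body below are assumptions; only the class name, the attribute name and the 'varlabel' / 'missing.values' entries come from the description above.

```r
# Hypothetical constructor: attach one "documented" attribute and set the class.
documented <- function(x, ...) {
  structure(x, documented = list(...), class = "documented")
}

# Subsetting method: subset the data, then copy the attribute to the result.
`[.documented` <- function(x, ...) {
  doc <- attr(x, "documented")
  out <- NextMethod()              # the default method drops the attribute
  attr(out, "documented") <- doc
  class(out) <- oldClass(x)
  out
}

age <- documented(c(34, 51, NA, 27),
                  varlabel = "Age at enrolment",
                  missing.values = NA)
young <- age[c(1, 4)]
attr(young, "documented")$varlabel
```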

Heinz







Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread Laurent Gautier
Starting by working on an interface for such object(s) is probably the 
first step toward a unified solution, and this should come before 
deciding if and how R attributes are used.


It would also help to ensure a smooth transition from the existing 
classes implementing a similar solution (first the interface is added to 
those classes, then after a grace period the classes are eventually 
refactored).


Dimension-level is what seems to be the most needed... but I am not 
convinced of the practicality of the object-level and cell-level schemes 
proposed:


- Object-level, if not linked to any dimension attribute, amounts to 
saying that one wants to attach anything to any object. That's what 
attr() is already doing.


- Cell-level is maybe out of scope for a first trial (but maybe I 
missed the use cases for it)




If starting with behaviour, it seems to boil down to having `[`/`[<-` and 
dimmeta()/`dimmeta<-`():


- extract `[` / replace `[<-`:

  * keeps working the way it already does

  * extracts a subset of the object as well as a subset of the 
dimension-associated metadata.


  * departing too much from the way `[` is working and adding 
behind-the-curtain name matching will only compromise the chances of 
adoption.


  * forget about the bit about which metadata is kept and which one 
isn't when using `[`. Make a function unmeta() (similar behavior to 
unname()) to drop them all, or work it out with something like

 dimmeta(x, 1) <- NULL # drop the metadata associated with dimension 1

- access the dimension-associated metadata:

  * maybe a function called dimmeta() (for consistency with 
dimnames())? The signature could be dimmeta(x, i), with x the object 
and i the dimension requested. A replacement function `dimmeta<-`(x, i, 
value) would be provided.
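
A minimal sketch of the accessors described above, under some assumptions that are not in the proposal: the metadata are stored in an attribute named "dimmeta", `i = NULL` returns the whole list, and the length check uses NROW() so that a table-like value is measured by its rows. The `[` method itself is not covered here.

```r
# Read accessor: the whole list when `i` is NULL, else one dimension's metadata.
dimmeta <- function(x, i = NULL) {
  meta <- attr(x, "dimmeta")
  if (is.null(i)) meta else meta[[i]]
}

# Replacement function, so that `dimmeta(x, i) <- value` works.
`dimmeta<-` <- function(x, i, value) {
  meta <- attr(x, "dimmeta")
  if (is.null(meta)) meta <- vector("list", length(dim(x)))
  if (!is.null(value) && NROW(value) != dim(x)[i]) {
    stop("metadata length does not match dimension ", i)
  }
  meta[i] <- list(value)   # list(NULL) stores NULL without shrinking the list
  attr(x, "dimmeta") <- meta
  x
}

# Drop all dimension metadata, in the spirit of unname().
unmeta <- function(x) {
  attr(x, "dimmeta") <- NULL
  x
}

x <- array(1:30, dim = c(2, 3, 5))
dimmeta(x, 1) <- c("ctrl", "treated")
dimmeta(x, 3) <- letters[1:5]
dimmeta(x, 1) <- NULL        # drop the metadata associated with dimension 1
str(dimmeta(x))
```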



In the abstract, the names associated with a given dimension are just 
one kind of possible metadata, but I'd keep away from meddling with them 
for a start.



It would seem natural that the metadata associated with one dimension 
would be a table-like object (data.frame seems natural in R, and 
unfortunately there is no data.frame-like structure in R).




L.



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread SIES 73
Very good points. They closely match the current prototype I have written...

> Starting by working on an interface for such object(s) is probably the first 
> step toward a unified solution

Agree. Getting a good API is always the most important step.

> Dimension-level is what seems to be the most needed...

True, and that was Henrik's original suggestion. But I find all three are 
closely related to the same topic (metadata) and as such deserve to be worked 
out together, but if most people agree otherwise, the direction is clear.

> - Object-level, if not linked to any dimension attribute, amounts to saying 
> that one wants to attach anything to any object. That's what attr() is 
> already doing.

Except that plain attributes are dropped when subsetting. I've found myself 
dozens of times creating classes just to create a `[` method for them that 
preserves some attributes. This looks like such a common situation that having 
a mechanism to avoid the user programming the same stuff again and again would 
be handy.
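
The dropping of plain attributes is easy to reproduce in base R. The attribute name "units" below is just an example:

```r
x <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), NULL))
attr(x, "units") <- "kg"

y <- x[, 1:2]
# dim and dimnames survive the subsetting, the user-defined attribute does not
print(names(attributes(y)))
print(attr(y, "units"))
```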

> - Cell-level is maybe out of scope for a first trial (but maybe I missed 
> the use cases for it)

Although I agree that cell-level is far less common, here are a couple of use 
cases I've hit recently:

1) the array represents time series in columns. The original data comes in a 
different frequency for each column, with some data missing. When aligning to 
a common frequency and interpolating missing values, I needed a factor array of 
the same dimension as the data array identifying whether each observation 
corresponded to the actual original series, or had been interpolated, and 
whether interpolation was due to missing data or to frequency alignment. Of 
course, I needed the factor array to be subsetted together with the array.

2) the array is a table representing data to be formatted by a reporting system 
(Sweave, R2HTML, etc), similar to the 'xtable' class. So I needed to associate 
formatting information to each individual cell (font, color, borders...), as 
well as to each dimension and to the whole table.

Anyway, it's far easier to add cell-level metadata on top of the other 
features with a new class: for `[` subscripting just call NextMethod() and then 
apply the same indexes to the object storing the cell-level metadata. But I 
still think it's useful to work out a data object's metadata at all possible 
levels with a unified interface.
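
The NextMethod() approach could look roughly like this for the interpolation use case. The class name "flagged" and the attribute name "flags" are invented for the sketch, a character matrix stands in for the factor array, and only matrix-style `x[i, j]` indexing is handled.

```r
`[.flagged` <- function(x, i, j, ..., drop = TRUE) {
  flags <- attr(x, "flags")
  out <- NextMethod()                            # default method subsets the data
  attr(out, "flags") <- flags[i, j, drop = drop] # same indexes on the metadata
  class(out) <- oldClass(x)
  out
}

values <- matrix(c(1.0, 1.5, 2.0, 4.0, 4.5, 5.0), nrow = 3)
flags  <- matrix(c("original", "interpolated", "original",
                   "original", "original", "interpolated"), nrow = 3)
x <- structure(values, flags = flags, class = "flagged")

y <- x[2:3, ]          # rows 2 and 3 of both the data and the flags
attr(y, "flags")
```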


About the subscripting `[` methods, I don't see the need to modify `[<-` for 
arrays, as out-of-bound indexes generate errors with arrays (unlike vectors or 
data frames), so `[<-` would only replace data and leave metadata untouched. Am 
I missing something? 
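
The difference between arrays and vectors on out-of-bound assignment can be illustrated as follows, for matrix-style `m[i, j]` indexes:

```r
m <- matrix(1:4, nrow = 2)
res <- tryCatch({ m[3, 1] <- 0L; "assigned" },
                error = function(e) conditionMessage(e))
print(res)     # an error message; the matrix keeps its dimensions

v <- 1:3
v[5] <- 9L     # a plain vector grows instead, padding the gap with NA
print(v)
```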

> maybe a function called dimmeta() (for consistency with dimnames())? 

I'm using 'dimdata' in my current prototype, and Henrik suggested 'dimattr', 
but I really like your proposal more. 

Wrappers to the first two elements of 'dimmeta' for 2-dim arrays could be added 
in the same vein as 'rownames' and 'colnames': 'rowmeta' and 'colmeta'.

> The signature could be dimmeta(x, i), with x the object, 

For consistency with 'dimnames', the 'i' argument could be dropped and 
dimmeta(x)[[i]] used instead...
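
A sketch of that dimnames-style variant, with the 'rowmeta' and 'colmeta' wrappers. Storing the list in an attribute called "dimmeta" is an assumption; nothing here is the actual prototype.

```r
dimmeta <- function(x) attr(x, "dimmeta")

`dimmeta<-` <- function(x, value) {
  stopifnot(is.null(value) || length(value) == length(dim(x)))
  attr(x, "dimmeta") <- value
  x
}

# Wrappers for the first two dimensions, like rownames() and colnames().
rowmeta <- function(x) dimmeta(x)[[1L]]
colmeta <- function(x) dimmeta(x)[[2L]]

m <- matrix(1:6, nrow = 2)
dimmeta(m) <- list(c("ctrl", "treated"), c("t0", "t1", "t2"))
dimmeta(m)[[2]] <- c("0h", "1h", "2h")   # works as it does for dimnames(m)[[2]]
colmeta(m)
```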


Other standard generics to be affected would be:

 * rbind & cbind for 2-dim arrays/matrices: they should combine the metadata, 
and for dimension-sensitive metadata can be modelled upon what is done with 
dimnames: use rowmeta (colmeta) of the first object with them in cbind (rbind), 
and combine colmeta (rowmeta) of all objects with them, filling with 
NAs/NULLs/.. for non metadata-sensitive objects being combined. An issue of 
coercing dimmeta of different classes may arise.

 * `dim<-`, but this may raise the same problem of coercing dimmeta of 
different classes.
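
For the cbind case, the rule described above could be sketched like this for two matrices. The helper name cbind_meta() is hypothetical, the metadata are assumed to be plain vectors stored as list(rowmeta, colmeta) in a "dimmeta" attribute, and the coercion issue between different metadata classes is not addressed.

```r
cbind_meta <- function(a, b) {
  # colmeta of one object, or NAs when it carries no metadata
  colmeta_or_na <- function(m) {
    meta <- attr(m, "dimmeta")[[2L]]
    if (is.null(meta)) rep(NA, ncol(m)) else meta
  }
  # rowmeta comes from the first object that has some
  rowmeta <- attr(a, "dimmeta")[[1L]]
  if (is.null(rowmeta)) rowmeta <- attr(b, "dimmeta")[[1L]]

  out <- cbind(a, b)
  attr(out, "dimmeta") <- list(rowmeta, c(colmeta_or_na(a), colmeta_or_na(b)))
  out
}

a <- matrix(1:4, nrow = 2)
attr(a, "dimmeta") <- list(c("ctrl", "treated"), c("t0", "t1"))
b <- matrix(5:6, nrow = 2)          # no metadata attached

ab <- cbind_meta(a, b)
attr(ab, "dimmeta")
```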


...and I agree with the rest of your comments.

Best,

Enrique


Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread SIES 73
Forgot to answer this one:

> It would seem natural that metadata associated with one dimension:
> would a table-like object  

Right. A data frame has the problem that for most use cases one would want 
each dimension's length to match the *rows* of the data frame instead of the 
columns, but it is the columns that we would get for free when allowing 
dimmeta elements to be lists...
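
The rows-versus-columns point follows from a data frame being a list of columns, as this small base R illustration shows (the example data are made up):

```r
meta <- data.frame(label = c("t0", "t1", "t2"), hours = c(0, 1, 2))

n_len  <- length(meta)        # counts columns, not rows
picked <- names(meta[2])      # single-index `[` selects columns
rows   <- meta[2:3, ]         # row subsetting needs the two-index form

print(n_len)
print(picked)
print(rows)
```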

Enrique



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread Laurent Gautier

Bengoechea Bartolomé Enrique (SIES 73) wrote:

> Forgot to answer this one:
>
>> It would seem natural that metadata associated with one dimension: 
>> would a table-like object


[thanks for reading through what seem much like telescoped sentences]


> Right. A data frame has the problem that for most use cases one would
> want each dimension's length to match the *rows* of the data frame
> instead of the columns, but it is the columns that we would get for
> free when allowing dimmeta elements to be lists...


Think of one data.frame per dimension and each data.frame having its 
rows aligned along that dimension.


In the case of a matrix, the dim-1 data.frame would have as many rows as 
rows in the matrix and the dim-2 data.frame would have as many rows as 
columns in the matrix.


When thinking in terms of generalization, one can also note that the 
one-dimension case can already be modelled by a data.frame.
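
As an illustration of the one-data.frame-per-dimension layout for a matrix (all names and values below are invented for the example):

```r
m <- matrix(1:6, nrow = 2)
meta <- list(
  data.frame(id = c("s1", "s2"), group = c("ctrl", "treated")),  # one row per matrix row
  data.frame(time = c(0, 1, 2), unit = "h")                       # one row per matrix column
)
stopifnot(nrow(meta[[1]]) == nrow(m), nrow(meta[[2]]) == ncol(m))

# Subsetting the matrix along a dimension subsets the rows of that data.frame.
rows <- 2
cols <- c(1, 3)
sub <- m[rows, cols, drop = FALSE]
sub_meta <- list(meta[[1]][rows, , drop = FALSE],
                 meta[[2]][cols, , drop = FALSE])
str(sub_meta)
```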



L.




Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-08 Thread SIES 73
Hi,

I agree with Henrik that his suggestion to have dimension vector attributes 
working like dimnames (see below) would be an extremely useful infrastructure 
addition to R.

If this is not considered for R-core, I am happy to try to implement this in a 
package, as a new class. And possibly do the same thing for data frames. Should 
you have any comments, ideas or suggestions about it, please share!

Best,

Enrique



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-08 Thread Tony Plate

There have been times when I've thought this could be useful too.

One way to go about it could be to introduce a special attribute that 
controls how attributes are dealt with in subsetting, e.g., 
"attr.dimname.like".  The contents of this would be character data; on 
subsetting, any attribute that had a name appearing in this vector would 
be treated like dimnames.  At the same time, it might be nice to also 
introduce "attr.keep.on.subset", which would specify which attributes 
should be kept on the result of a subsetting operation (could be useful 
for attributes that specify units).  This of course could be a way of 
implementing Henrik's suggestion: dimattr(x, "misc") <- value would add 
"misc" to the "attr.dimname.like" attribute and also set the attribute 
"misc".  The tricky part would be modifying the `[` methods.   However, 
the most useful would probably be the one for ordinary matrices and 
arrays, and others could be modified when and if their maintainers see 
the need.
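
A minimal sketch of this scheme, assuming a helper subset_dimlike() that stands in for a modified `[` (its `idx` argument is a list with one index vector per dimension, TRUE meaning "everything"). The two control attribute names and dimattr() come from the discussion; the function bodies are guesses at one possible behaviour.

```r
# Register `name` as dimnames-like and store its value.
`dimattr<-` <- function(x, name, value) {
  stopifnot(length(value) == length(dim(x)))
  attr(x, "attr.dimname.like") <- union(attr(x, "attr.dimname.like"), name)
  attr(x, name) <- value
  x
}
dimattr <- function(x, name) attr(x, name)

# Stand-in for a modified `[` that honours both control attributes.
subset_dimlike <- function(x, idx) {
  out  <- do.call(`[`, c(list(x), idx, list(drop = FALSE)))
  like <- attr(x, "attr.dimname.like")
  keep <- attr(x, "attr.keep.on.subset")
  for (nm in like) {
    # each element is subsetted along its own dimension, like dimnames
    attr(out, nm) <- Map(function(m, i) m[i], attr(x, nm), idx)
  }
  for (nm in keep) attr(out, nm) <- attr(x, nm)   # copied unchanged
  attr(out, "attr.dimname.like") <- like
  attr(out, "attr.keep.on.subset") <- keep
  out
}

x <- array(1:30, dim = c(2, 3, 5))
dimattr(x, "misc") <- list(1:2, list(x = 1:5, y = letters[1:8], z = NA),
                           letters[1:5])
attr(x, "units") <- "kg"
attr(x, "attr.keep.on.subset") <- "units"

y <- subset_dimlike(x, list(TRUE, 1:2, 2:3))
str(dimattr(y, "misc"))
```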


-- Tony Plate



[Rd] Suggestion: Dimension-sensitive attributes

2009-06-07 Thread Henrik Bengtsson
Hi,

maybe this has been suggested before, but would it be possible,
without breaking too much existing code, to add other dimension
vector attributes in addition to 'dimnames'?  These attributes would
then be subsetted just like dimnames.

Something like this:

> x <- array(1:30, dim=c(2,3,5))
> dimnames(x) <- list(c("a", "b"), c("a1", "a2", "a3"), NULL);
> dimattr(x, "misc") <- list(1:2, list(x=1:5, y=letters[1:8], z=NA), letters[1:5]);

> y <- x[,1:2,2:3]
> str(dimnames(y))
List of 3
 $ : chr [1:2] "a" "b"
 $ : chr [1:2] "a1" "a2"
 $ : NULL
> str(dimattr(y, "misc"))
List of 3
 $ : int [1:2] 1 2
 $ :List of 2
  ..$ x: int [1:5] 1 2 3 4 5
  ..$ y: chr [1:8] "a" "b" "c" "d" ...
 $ : chr [1:2] "b" "c"

I can imagine this needs to be added in several places and functions
such as is.vector() need to be updated etc.  It is not a quick
migration, but is it something worth considering for the future?

/Henrik
