On Tuesday, July 19, 2016 at 12:18:17 PM UTC-5, Andreas Noack wrote:
>
> It seems to me that contrasts should be defined in the array 
> packages and not in DataFrames. We'd probably need the functions to be 
> defined in an upstream package like StatsBase or (ArrayBase/DataBase?) such 
> that all array packages can extend them.
>

That's the approach that makes the most sense to me too.  Right now 
CategoricalArrays only requires Compat and it does not seem that Milan is 
available to make changes in it.

> We have the usual problem of optional dependencies. Should DataFrames 
> depend on any data array package or all of them? Is it possible that 
> DataFrames doesn't use any features of concrete data array types and only 
> defines methods for abstract types? Then the user would have to load a 
> specific array package. This might be a bit demanding to keep working and 
> from a user perspective, a single good implementation might be better.
>
> What are the specific issues you are having right now? Are the things that 
> are broken things that used to work or is work in progress towards using 
> Nullable and Categorical arrays?
>

I was trying to use CategoricalArrays and failing.  This only affects 
PooledDataArrays and CategoricalArrays but there are other aspects like the 
termnames methods, whose generic is currently defined in DataFrames, but is 
linked to the contrasts.

Ultimately if PooledDataArray is replaced by CategoricalArray then these 
generics can all go into CategoricalArrays.  It would be necessary to have 
DataFrames require CategoricalArrays but I suspect that would happen anyway.

In a way I would like to split the Formula/Terms/ModelFrame/ModelMatrix 
material into a separate package but that package would need to depend on 
DataFrames so it wouldn't buy us much.

>
> On Tue, Jul 19, 2016 at 12:23 PM, Douglas Bates wrote:
>
>> Yes, thanks to Tony, Andreas, Milan and others who worked on this.
>>
>> At the risk of making myself unpopular I would like to return to the 
>> issue of ModelFrame, ModelMatrix, etc. because a lot of code is still 
>> broken for me.  At present `DataFrames/REQUIRE` lists `DataArrays 0.3.4` 
>> but neither `NullableArrays` nor `CategoricalArrays`.  Contrasts are 
>> defined in `DataFrames/src/statsmodels/formula.jl` but we would need to 
>> require `CategoricalArrays` if contrasts for that type were to be defined 
>> there.  To me it would make more sense to define the contrasts where the 
>> array types are defined.
>>
>> I can add `CategoricalArrays` to `DataFrames/REQUIRE` to get ModelMatrix 
>> working again but that might have a knock-on effect for many packages that 
>> require `DataFrames`.
>>
>> Although I'd really like to get ModelMatrix working again, I don't want 
>> to make changes like DataFrames requiring CategoricalArrays that later need 
>> to be backed out.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "julia-stats" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

