FWIW, I think it’s much easier to index structures if every row has an atomic 
existence that is independent of the table it is currently part of. (This is a 
big part of my interest in moving away from matrix semantics and towards 
relational model semantics.)

It’s a little harder to index DataFrames because the row indices change over 
time, so your index can’t just map values to row indices. (Well, it can, but 
then it needs to be updated very frequently: if you delete the first row of a 
DataFrame, every later row’s index decreases by one, so potentially the entire 
index has to be rewritten.)
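
To make the difference concrete, here is a rough sketch in plain Python (not 
DataFrames code). The names build_positional_index, column, rows and by_value 
are made up for illustration, and the row ids 101..104 are arbitrary. It only 
shows why a value-to-position index goes stale after a deletion, while an 
index keyed on stable row ids needs a single local update.

```python
from collections import defaultdict


def build_positional_index(values):
    """Map each value to the list of row positions where it occurs."""
    index = defaultdict(list)
    for position, value in enumerate(values):
        index[value].append(position)
    return dict(index)


# Hypothetical column of a table, stored as a plain list (matrix-like rows).
column = ["a", "b", "a", "c"]
positional = build_positional_index(column)

# Deleting the first row moves every later row to a smaller position, so the
# positions stored in `positional` no longer match the data.
del column[0]
stale = positional
rebuilt = build_positional_index(column)  # full rebuild over all rows

# Alternative: every row has a stable id that is independent of its position.
rows = {101: "a", 102: "b", 103: "a", 104: "c"}
by_value = defaultdict(set)
for row_id, value in rows.items():
    by_value[value].add(row_id)

# Deleting row 101 touches one index entry; all other entries stay valid.
deleted_value = rows.pop(101)
by_value[deleted_value].discard(101)
```

In the first half the whole index is rebuilt after one deletion; in the 
second half only the entry for the deleted row’s value is touched.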

 — John

On Sep 7, 2014, at 10:27 AM, Harlan Harris <[email protected]> wrote:

> This was a feature that sorta existed for a while (see 
> https://github.com/JuliaStats/DataFrames.jl/issues/24 ), but nobody was very 
> happy with it, and I think John ripped it out as part of one of his 
> simplification passes. It's tricky to think about how best to implement this 
> sort of feature when you aspirationally want to support memory-mapped and 
> distributed structures too, and where you want a semantics that's explicitly 
> set-like, cf. Pandas or R's data.table.
> 
> Also worth thinking about this in the context of John's just-announced goals: 
> https://gist.github.com/johnmyleswhite/ad5305ecaa9de01e317e
> 
> 
> 
> On Sun, Sep 7, 2014 at 12:54 PM, John Myles White <[email protected]> 
> wrote:
> No, DataFrames are not indexed. For now, you’d need to build a wrapper that 
> indexes a DataFrame to get that kind of functionality.
> 
>  — John
> 
> On Sep 7, 2014, at 9:53 AM, Steven Sagaert <[email protected]> wrote:
> 
> > Hi,
> > I was wondering if searching in a dataframe is indexed (in the DB sense, 
> > not the array sense, e.g. a tree index structure) or not? If so, can you 
> > have multiple indices (on multiple columns)?
> 
> 
