(Sorry, I haven't played with Google Refine 2.0 yet, but I would like
to make some comments.)
I'm pretty sure we're on the same page regarding the big vision:
make data more fun, more useful, and easier to deal with. My focus
is on a smaller and more immediate problem: how to let people handle
messy data (without having to resort to programming, or a $500K
enterprise-level data analysis package).
I appreciate all the work David does, although I haven't had a chance
to play with some of it. Lately, when I introduce Linked Data, I start
with demos of a few Exhibit instances. I call Exhibit a Raw Data
application and point out that it gives you different views of the
same data. Then I move on to data integration and call out the need
for cross-domain Raw Data. I try to avoid talking about triples and RDF.
I sometimes try to explain that Tabulator and Exhibit share a key
piece of infrastructure, a client-side database, and that the
extension version of Tabulator could have been even better because
every tab shares the same database, so it can do cross-domain data
integration. That argument doesn't work very well, because Tabulator
doesn't look as good as Exhibit.
I tried to start an effort to embed Exhibit into Tabulator, but I
didn't spend enough time on it, so it didn't happen. I still think it
would be a good project.
Side questions:
1. Exhibit used to become very slow when the amount of data was too
large, because JavaScript wasn't very fast at the time. Would
IndexedDB solve this problem? Has anyone tried that?
2. Does anyone know of a good UI for query-by-example, the technique
Tabulator uses (hence its name)? I still think it's Tabulator's most
overlooked feature: powerful and general in theory, but it suffered
from poor usability.
Cheers,
Kenny