On Tue, Jun 9, 2015 at 11:53 AM, Dan Andreescu <[email protected]>
wrote:
> Eric, I think we should allow arbitrary querying on any dimension for that
> first data block. We could pre-aggregate all of those combinations pretty
> easily since the dimensions have very low cardinality.
>
Are you thinking about something like
/{project|all}/{agent|all}/{day}/{hour}, or will there be a lot more
dimensions?
> For the article-level data, no, we'd want just basic timeseries querying.
>
> Thanks Gabriel, if you could point us to an example of these secondary
> RESTBase indices, that'd be interesting.
>
The API used to define these tables is described in
https://github.com/wikimedia/restbase/blob/master/doc/TableStorageAPI.md,
and the algorithm used to keep those indexes up to date is described in
https://github.com/wikimedia/restbase-mod-table-cassandra/blob/master/doc/SecondaryIndexes.md
and largely implemented in
https://github.com/wikimedia/restbase-mod-table-cassandra/blob/master/lib/secondaryIndexes.js
.
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics