Hi there --

I've been doing some analysis using the raw pageviews table in Hive, in
order to try to understand the effect that adding a sitemap to
it.wikipedia.org had on traffic[1].  As part of this analysis, I created
three temporary tables.  But, of course, those tables only exist within the
context of my own session, which is sub-optimal since I'm not the only one
trying to understand this.

What's the best way to go about persisting these tables?  I can use
CREATE TABLE ... AS SELECT (Hive has no SELECT INTO) to copy the data into
a non-temp table, but don't want to do so willy-nilly.
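
For concreteness, here is a rough sketch of the kind of thing I have in
mind, written as a Hive CREATE TABLE ... AS SELECT.  The database name
(scratch_db) and the table names are made up for illustration, and the
Parquet storage format is only an assumption on my part, so please treat
this as a strawman rather than a plan:

```sql
-- Hypothetical names throughout: sitemap_tmp_1 stands in for one of my
-- three temporary tables, and scratch_db for wherever these should live.
CREATE TABLE scratch_db.sitemap_pageviews_1
STORED AS PARQUET
AS
SELECT * FROM sitemap_tmp_1;

-- Cleanup once the analysis is finished.
DROP TABLE IF EXISTS scratch_db.sitemap_pageviews_1;
```

I'd repeat the same thing for the other two tables.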

They'll probably need to stick around for about two weeks, I would guess.
Each of the three tables in question is about 5 million rows with three
columns (a string and two ints).

Thanks!

- Ian
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
