Thanks. I'll try the lists. Completely forgot about them, actually.

On Mon, Apr 15, 2013 at 12:59 PM, Jim Klo <[email protected]> wrote:

> Not sure if it's ideal, but if you need dates in epoch millis, you could
> round the timestamp to the floor of the current day (say midnight) in a map
> function, use a built-in reduce... Then use a list function to filter
> unique countries.
>
> If you don't need a real timestamp value, use an integer like YYYYMMDD
> (i.e. 20130710 for 2013-Jul-10).
>
> Reduce = true will combine by day, making at most (196 countries x number
> of days in range) rows to filter in the list function.
>
> - JK
>
> Sent from my iPad
>
> On Apr 14, 2013, at 6:38 PM, "Andrey Kuprianov" <[email protected]> wrote:
>
> > Hi guys,
> >
> > Just for the sake of a debate. Here's the question. There are
> > transactions. Among all other attributes there's a timestamp (when the
> > transaction was made; in seconds) and a country name (from where the
> > transaction was made). So, for instance,
> >
> > {
> >     . . . .
> >     "timestamp": 1332806400,
> >     "country_name": "Australia",
> >     . . . .
> > }
> >
> > Question is: how does one get unique / distinct country names in between
> > dates? For example, give me all country names in between 10-Jul-2010 and
> > 21-Jan-2013.
> >
> > My solution was to write a custom reduce function and set
> > reduce_limit=false, so that I can enumerate all countries without hitting
> > the overflow exception. It works great! However, such solutions are
> > frowned upon by everyone around. Has anyone a better idea on how to
> > tackle this efficiently?
> >
> > Andrey
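
For reference, here is a rough sketch of what Jim's suggestion might look like, written as plain JavaScript so it runs outside CouchDB. The names dayKey, mapTransaction, queryRange and uniqueCountries are made up for illustration, and queryRange only imitates what a view with the built-in _count reduce and group=true would return for a key range. It is one way to read the advice, not a tested design document.

```javascript
// Turn a timestamp in seconds into an integer day key such as 20130710 (UTC).
function dayKey(timestampSeconds) {
  var d = new Date(timestampSeconds * 1000);
  return d.getUTCFullYear() * 10000 + (d.getUTCMonth() + 1) * 100 + d.getUTCDate();
}

// Map step: one row per transaction, keyed by [day, country].
// Inside CouchDB, emit is a global; here it is passed in so the sketch is runnable.
function mapTransaction(doc, emit) {
  if (doc.timestamp && doc.country_name) {
    emit([dayKey(doc.timestamp), doc.country_name], 1);
  }
}

// Stand-in for querying the view with a count reduce grouped by exact key,
// restricted to startDay..endDay inclusive (assumption: not CouchDB itself).
function queryRange(docs, startDay, endDay) {
  var counts = {};
  docs.forEach(function (doc) {
    mapTransaction(doc, function (key, value) {
      if (key[0] >= startDay && key[0] <= endDay) {
        var id = JSON.stringify(key);
        counts[id] = (counts[id] || 0) + value;
      }
    });
  });
  return Object.keys(counts).sort().map(function (id) {
    return { key: JSON.parse(id), value: counts[id] };
  });
}

// List step: reduce the (day, country) rows to a sorted list of distinct countries.
function uniqueCountries(rows) {
  var seen = {};
  var result = [];
  rows.forEach(function (row) {
    var country = row.key[1];
    if (!Object.prototype.hasOwnProperty.call(seen, country)) {
      seen[country] = true;
      result.push(country);
    }
  });
  return result.sort();
}
```

Against a real view, the equivalent range would presumably be startkey=[20100710] and endkey=[20130121,{}] with group=true, and the body of uniqueCountries would live in a list function that walks the rows. The list function then sees at most one row per country per day rather than one per transaction.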
