Re: Refreshing a persisted RDD

2017-05-19 Thread Sudhir Menon
Part of the problem here is that the static dataframe is designed to be used as a read-only abstraction in Spark, and updating it requires the user to drop the dataframe holding the reference data and recreate it. And in order for the join to use the recreated dataframe, the query has to be

Re: RE: Fast write datastore...

2017-03-16 Thread Sudhir Menon
I am extremely leery about pushing product on this forum and have refrained from it in the past. But since you are talking about loading parquet data into Spark, running some aggregate queries and then writing the results to a fast data store, and are specifically asking for product options, it makes