Hi all,
Is there a method for reading from S3 without having to hard-code keys? The
only two ways I've found both require embedding the keys:
1. Set conf in code e.g.:
sc.hadoopConfiguration().set("fs.s3.awsAccessKeyId", "")
sc.hadoopConfiguration().set("fs.s3.awsSecretAccessKey", "")
2. Set keys in URL, e.g.:
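(The URL form embeds the credentials in the path itself, something like
s3n://<access-key>:<secret-key>@bucket/path, which has the same hard-coding
problem.) As a sketch of avoiding literal keys in the code, assuming the
standard AWS environment variables are exported on the driver; on EC2 the
s3a connector can also pick up the instance's IAM role, needing no keys at
all:

```scala
// Sketch: pull the credentials from the environment at runtime instead of
// hard-coding them (assumes AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY are
// set in the driver's environment).
val conf = sc.hadoopConfiguration
conf.set("fs.s3n.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
conf.set("fs.s3n.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))

// Alternatively, put the fs.s3n.* properties in core-site.xml on the
// cluster so they never appear in application code.
```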
> Could you share more on the use case? It looks a little bit like an abuse
> of Spark in general. Interactive queries that are not suitable for
> in-memory batch processing might be better supported by Ignite, which has
> in-memory indexes, a concept of hot, warm, and cold data, etc., or by Hive
> on Tez+LLAP.
Hi all,
What's the best way to run ad-hoc queries against a cached RDD?
For example, say I have an RDD that has been processed and persisted
memory-only. I want to be able to run a count (actually
"countApproxDistinct") after filtering by a value that is unknown at
compile time (it is specified by the query).
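A minimal sketch of that pattern, with a hypothetical Event record and
"country" field standing in for whatever the real data looks like -- the
RDD is cached once, and each query builds a filter from a runtime
parameter:

```scala
import org.apache.spark.storage.StorageLevel

// Hypothetical record type, for illustration only.
case class Event(user: String, country: String)

val events = sc.parallelize(Seq(
  Event("a", "US"), Event("b", "US"), Event("a", "DE")))

// Cache once; subsequent queries reuse the in-memory partitions.
events.persist(StorageLevel.MEMORY_ONLY)

// The predicate value arrives at query time, not compile time.
def distinctUsers(country: String): Long =
  events.filter(_.country == country)
        .map(_.user)
        .countApproxDistinct()  // HyperLogLog-based approximate count

distinctUsers("US")
```

countApproxDistinct takes an optional relative-accuracy argument if you
need tighter or looser bounds than the default.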