Re: Pre-build Spark for Windows 8.1

2013-12-07 Thread Andrew Ash
Thanks for the info, Matei. It seems like lots of other Akka users have similar issues -- maybe at some point in the future it'll be worth making this a bit more flexible, but there are more important places to spend time right now. On Fri, Dec 6, 2013 at 12:06 PM, Matei Zaharia wrote: > Hey And

Biggest spark.akka.framesize possible

2013-12-07 Thread Matt Cheah
Hi everyone, Like others, I'm noticing that group-by operations with large groups give Spark some trouble. Increasing the spark.akka.frameSize property alleviates this up to a point. I was wondering what the maximum setting for this value is. I've seen previous e-mails talking about the ra

Re: Writing an RDD to Hive

2013-12-07 Thread Matei Zaharia
Hi Philip, There are a few things you can do: - If you want to avoid the data copy with a CREATE TABLE statement, you can use CREATE EXTERNAL TABLE, which points to an existing file or directory. - If you always reuse the same table, you could CREATE TABLE only once and then simply place files

data storage formats

2013-12-07 Thread Ankur Chauhan
Hi all, I am wondering what people use as the on-disk storage format. Almost all the examples I have seen use CSV files to store and load data, but that seems too simplistic for obvious reasons (compressibility, to name one). I was just interested to find out what people use to store computat