Re: BlockManager issues

2014-09-22 Thread Hortonworks
Actually I met a similar issue when doing groupByKey and then count if the shuffle size is big, e.g. 1 TB. Thanks. Zhan Zhang Sent from my iPhone On Sep 21, 2014, at 10:56 PM, Nishkam Ravi nr...@cloudera.com wrote: Thanks for the quick follow-up Reynold and Patrick. Tried a run with
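A minimal sketch (not from the thread) of the groupByKey-then-count shape Zhan describes; at a shuffle size around 1 TB this pattern was hitting the fetch failures discussed above:

    import org.apache.spark.{SparkConf, SparkContext}

    // Illustrative only: tiny data and a local master standing in for the ~1 TB cluster job.
    val sc = new SparkContext(new SparkConf().setAppName("groupByKey-count").setMaster("local[2]"))
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    val grouped = pairs.groupByKey() // wide dependency: shuffles every value across the cluster
    println(grouped.count())        // the action that forces the shuffle to run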

Re: BlockManager issues

2014-09-22 Thread Christoph Sawade
Hey all. We also had the same problem described by Nishkam, in almost the same big-data setting. We fixed the fetch failure by increasing the timeout for acks in the driver: set("spark.core.connection.ack.wait.timeout", "600") // 10-minute timeout for acks between nodes. Cheers, Christoph 2014-09-22
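A minimal sketch of the workaround Christoph describes, applied to the driver's SparkConf (key and value are passed as strings):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("long-ack-timeout")
      // 600 seconds = 10 minutes before a missing ack is treated as a failure
      .set("spark.core.connection.ack.wait.timeout", "600")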

Re: BlockManager issues

2014-09-22 Thread David Rowe
I've run into this with large shuffles - I assumed that there was contention between the shuffle output files and the JVM for memory. Whenever we start getting these fetch failures, it corresponds with high load on the machines the blocks are being fetched from, and in some cases complete

FW: Spark SQL 1.1.0: NPE when joining two cached tables

2014-09-22 Thread Haopu Wang
Forwarding to the dev mailing list for help. From: Haopu Wang Sent: September 22, 2014 16:35 To: u...@spark.apache.org Subject: Spark SQL 1.1.0: NPE when joining two cached tables I have two data sets and want to join them on their first fields. Sample data are below: data set
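A hypothetical reconstruction of the reported scenario (table and column names are made up; the real data sets are not shown). The NPE was reportedly hit when joining two tables that had both been cached:

    import org.apache.spark.sql.SQLContext

    // Assumes an existing SparkContext `sc` (e.g. in the spark-shell).
    val sqlContext = new SQLContext(sc)
    import sqlContext._

    case class Rec(key: Int, value: String)
    sc.parallelize(Seq(Rec(1, "a"), Rec(2, "b"))).registerTempTable("t1")
    sc.parallelize(Seq(Rec(1, "x"), Rec(3, "y"))).registerTempTable("t2")
    sqlContext.cacheTable("t1")
    sqlContext.cacheTable("t2")
    // Join the two cached tables on their first field, as in the report.
    sql("SELECT * FROM t1 JOIN t2 ON t1.key = t2.key").collect()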

Re: A couple questions about shared variables

2014-09-22 Thread Nan Zhu
If you think it is necessary to fix, I would like to resubmit that PR (it seems to have some conflicts with the current DAGScheduler). My suggestion is to make it an option on the accumulator: e.g. some algorithms use accumulators for result calculation and need a deterministic accumulator,
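A small sketch (mine, not from the PR) of the non-determinism being discussed: an accumulator updated inside a transformation is applied once per task execution, so a retried or speculatively executed task can add its updates more than once.

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("accumulator-replay").setMaster("local[2]"))
    val processed = sc.accumulator(0)
    val data = sc.parallelize(1 to 1000)
    // The update runs every time the task runs, including re-runs after failures.
    data.map { x => processed += 1; x * 2 }.count()
    println(processed.value) // may exceed 1000 if any task was re-executed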

Re: A couple questions about shared variables

2014-09-22 Thread Sandy Ryza
MapReduce counters do not double-count. In MapReduce, if a task needs to be re-run, the value of the counter from the second attempt overwrites the value from the first. -Sandy On Mon, Sep 22, 2014 at 4:55 AM, Nan Zhu zhunanmcg...@gmail.com wrote: If you think it is necessary to fix,

Re: hash vs sort shuffle

2014-09-22 Thread Cody Koeninger
Unfortunately we were somewhat rushed to get things working again and did not keep the exact stacktraces, but one of the issues we saw was similar to that reported in https://issues.apache.org/jira/browse/SPARK-3032 We also saw FAILED_TO_UNCOMPRESS errors from snappy when reading the shuffle

Re: guava version conflicts

2014-09-22 Thread Gary Malouf
Hi Marcelo, Interested to hear the approach to be taken. Shading guava itself seems extreme, but that might make sense. Gary On Sat, Sep 20, 2014 at 9:38 PM, Marcelo Vanzin van...@cloudera.com wrote: Hmm, looks like the hack to maintain backwards compatibility in the Java API didn't work

SPARK_CLASSPATH in core/pom.xml and yarn/pom.xml

2014-09-22 Thread Ye Xianjin
Hi: I notice the scalatest-maven-plugin sets the SPARK_CLASSPATH environment variable for testing. But in SparkConf.scala, this is deprecated in Spark 1.0+. So what is this variable for? Should we just remove it? -- Ye Xianjin Sent with Sparrow

Re: A couple questions about shared variables

2014-09-22 Thread Nan Zhu
I see, thanks for pointing this out. -- Nan Zhu On Monday, September 22, 2014 at 12:08 PM, Sandy Ryza wrote: MapReduce counters do not double-count. In MapReduce, if a task needs to be re-run, the value of the counter from the second attempt overwrites the value from the first

Re: guava version conflicts

2014-09-22 Thread Marcelo Vanzin
Hi Cody, I'm still writing a test to make sure I understood exactly what's going on here, but from looking at the stack trace, it seems like the newer Guava library is picking up the Optional class from the Spark assembly. Could you try one of the options that put the user's classpath before the
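For reference, one of the options Marcelo is referring to can be set like this (a sketch; in this Spark version the setting is documented as experimental and, per the follow-up below, takes effect on executors only):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("user-classpath-first")
      // Prefer classes from the user's jars over the Spark assembly on executors.
      .set("spark.files.userClassPathFirst", "true")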

Re: Support for Hive buckets

2014-09-22 Thread Michael Armbrust
Hi Cody, There are currently no concrete plans for adding buckets to Spark SQL, but that's mostly due to lack of resources / demand for this feature. Adding full support is probably a fair amount of work since you'd have to make changes throughout parsing/optimization/execution. That said, there

Re: guava version conflicts

2014-09-22 Thread Cody Koeninger
We're using Mesos. Is there a reasonable expectation that spark.files.userClassPathFirst will actually work? On Mon, Sep 22, 2014 at 1:42 PM, Marcelo Vanzin van...@cloudera.com wrote: Hi Cody, I'm still writing a test to make sure I understood exactly what's going on here, but from looking

Re: guava version conflicts

2014-09-22 Thread Marcelo Vanzin
Hmmm, a quick look at the code indicates this should work for executors, but not for the driver... (maybe a bug should be filed, if there isn't one already?) If it's feasible for you, you could remove the Optional.class file from the Spark assembly you're using. On Mon, Sep 22, 2014 at

Re: guava version conflicts

2014-09-22 Thread Cody Koeninger
We've worked around it in the meantime by excluding Guava from transitive dependencies in the job assembly and specifying the same Guava 14 version that Spark is using. Obviously things break whenever a Guava 15 / 16 feature is used at runtime, so a long-term solution is needed. On Mon, Sep
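A build.sbt sketch of that workaround (the dependency coordinates are illustrative, not taken from the thread): strip Guava from the dependencies that drag in 15/16 and pin the Guava 14 line that Spark 1.1 itself uses, so the job assembly matches the cluster classpath.

    // Illustrative coordinates; substitute the libraries actually in your assembly.
    libraryDependencies ++= Seq(
      ("some.org" % "library-that-pulls-newer-guava" % "1.0")
        .exclude("com.google.guava", "guava"),
      "com.google.guava" % "guava" % "14.0.1"
    )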

Re: guava version conflicts

2014-09-22 Thread Marcelo Vanzin
FYI I filed SPARK-3647 to track the fix (some people internally have bumped into this also). On Mon, Sep 22, 2014 at 1:28 PM, Cody Koeninger c...@koeninger.org wrote: We've worked around it for the meantime by excluding guava from transitive dependencies in the job assembly and specifying the

OutOfMemoryError on parquet SnappyDecompressor

2014-09-22 Thread Cody Koeninger
After commit 8856c3d8 switched the default parquet compression codec from gzip to snappy, I'm seeing the following when trying to read parquet files saved with the new default (same schema and roughly the same size as files that were previously working): java.lang.OutOfMemoryError: Direct buffer
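One possible way to keep producing readable files while this is investigated (a sketch, assuming an existing SparkContext `sc`; it does not help with files already written with snappy) is to switch the writer back to the previous default:

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    // Write new parquet files with the pre-8856c3d8 default codec.
    sqlContext.setConf("spark.sql.parquet.compression.codec", "gzip")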

Re: hash vs sort shuffle

2014-09-22 Thread Patrick Wendell
Hey Cody, In terms of Spark 1.1.1 - we wouldn't change a default value in a point release. Making this the default is slotted for 1.2.0: https://issues.apache.org/jira/browse/SPARK-3280 - Patrick On Mon, Sep 22, 2014 at 9:08 AM, Cody Koeninger c...@koeninger.org wrote: Unfortunately we were
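Until that default changes, the sort-based shuffle has to be opted into explicitly; a minimal sketch on 1.1.x:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("sort-shuffle")
      // Default on 1.1.x is "hash"; SPARK-3280 tracks making "sort" the default in 1.2.0.
      .set("spark.shuffle.manager", "sort")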