from:"Peter Halliday"

how to investigate skew and DataFrames and RangePartitioner

2016-06-13 Thread Peter Halliday

does one achieve this now. Peter Halliday - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Error writing parquet to S3

2016-06-10 Thread Peter Halliday

Has anyone else seen this before? When I saw this previously there was an OOM, but that doesn’t seem to be the case here. Of course, I’m not sure how large the file that created this was either. Peter > On Jun 9, 2016, at 9:00 PM, Peter Halliday <pjh...@cornell.edu> wrote: > > I’m not 100% sure

Error writing parquet to S3

2016-06-09 Thread Peter Halliday

I’m not 100% sure why I’m getting this. I don’t see any errors before this at all. I’m not sure how to diagnose this. Peter Halliday [2016-06-10 01:46:05,282] WARN org.apache.spark.scheduler.TaskSetManager [task-result-getter-2hread] - Lost task 3737.0 in stage 2.0 (TID 10585, ip-172-16

UnsupportedOperationException: converting from RDD to DataSets on 1.6.1

2016-06-08 Thread Peter Halliday

I have some RDD-based code that was producing OOM errors during shuffle. So, at the direction of a member of Databricks, I started converting to Datasets. However, when we did, we started getting an error that seems to object to something within one of our case classes. Peter Halliday [2016-06-08 19

Re: EMR Spark log4j and metrics

2016-04-15 Thread Peter Halliday

Can anyone confirm whether Spark on YARN is the problem here? Or is it how AWS has put it together? I'm wondering if Spark on YARN has problems with configuration files for the workers and driver. Peter Halliday On Thu, Apr 14, 2016 at 1:09 PM, Peter Halliday <pjh...@cornell.edu>

Re: EMR Spark log4j and metrics

2016-04-14 Thread Peter Halliday

see evidence that the configuration files are read from or used after they are pushed On Wed, Apr 13, 2016 at 11:22 AM, Peter Halliday <pjh...@cornell.edu> wrote: > I have an existing cluster that I stand up via Docker images and > CloudFormation Templates on AWS. We are moving to

EMR Spark log4j and metrics

2016-04-13 Thread Peter Halliday

to a jar that’s sent via --jars to spark-submit. Peter Halliday

FileAlreadyExistsException and Streaming context

2016-03-08 Thread Peter Halliday

the stack trace: http://pastebin.com/AqBFXkga Peter Halliday

Re: Get rid of FileAlreadyExistsError

2016-03-01 Thread Peter Halliday

> Have you tried spark.hadoop.validateOutputSpecs? > > On 01-Mar-2016 9:43 pm, "Peter Halliday" <pjh...@cornell.edu> wrote: > http://pastebin.com/vbbFzyzb > > The problem seems to be t

Re: Get rid of FileAlreadyExistsError

2016-03-01 Thread Peter Halliday

, but no plans on changing this. I’m surprised not to see this fixed yet. Peter Halliday > On Mar 1, 2016, at 10:01 AM, Ted Yu <yuzhih...@gmail.com> wrote: > > Do you mind pastebin'ning the stack trace with the error so that we know > which part of the code is under discus

Get rid of FileAlreadyExistsError

2016-03-01 Thread Peter Halliday

the 1.5.1 version of this code doesn’t allow for this to be passed in. Is that correct? Peter Halliday

11 matches
