Re: [EXTERNAL] Re: Java Out of Memory Errors with CsvBulkLoadTool

2015-12-19 Thread Gabriel Reid
On Fri, Dec 18, 2015 at 9:35 PM, Cox, Jonathan A wrote:
> The Hadoop version is 2.6.2.

I'm assuming the reduce phase is failing with the OOME, is that correct? Could you run "jps -v" to see what the full set of JVM parameters are for the JVM that is running the task that
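For reference, `jps -v` lists the running JVMs together with the arguments passed to each one. The same information can also be read from inside a JVM. The following is a minimal sketch and not part of the original thread; the class name `JvmFlags` and the helper `findMaxHeapFlag` are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmFlags {

    /**
     * Returns the -Xmx flag found in the given JVM arguments,
     * or null if no such flag is present. If the flag appears
     * more than once, the last occurrence is returned.
     */
    static String findMaxHeapFlag(List<String> jvmArgs) {
        String found = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xmx")) {
                found = arg;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM (for example -Xmx48g), excluding main() arguments.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("-Xmx flag:     " + findMaxHeapFlag(jvmArgs));

        // Maximum heap the JVM will attempt to use, in megabytes.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        System.out.println("Max heap (MB): " + maxHeapMb);
    }
}
```

Running this with `java -Xmx48g JvmFlags` should report the `-Xmx48g` flag, which is the kind of setting the question above is trying to locate for the failing task.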

Re: Java Out of Memory Errors with CsvBulkLoadTool

2015-12-18 Thread Gabriel Reid
On Fri, Dec 18, 2015 at 4:31 PM, Riesland, Zack wrote:
> We are able to ingest MUCH larger sets of data (hundreds of GB) using the CSVBulkLoadTool.
>
> However, we have found it to be a huge memory hog.
>
> We dug into the source a bit and found that

Re: Java Out of Memory Errors with CsvBulkLoadTool

2015-12-18 Thread Gabriel Reid
Hi Jonathan,

Sounds like something is very wrong here.

Are you running the job on an actual cluster, or are you using the local job tracker (i.e. running the import job on a single computer)?

Normally an import job, regardless of the size of the input, should run with map and reduce tasks that

RE: Java Out of Memory Errors with CsvBulkLoadTool

2015-12-18 Thread Riesland, Zack
-Original Message-
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Friday, December 18, 2015 10:17 AM
To: user@phoenix.apache.org
Subject: Re: Java Out of Memory Errors with CsvBulkLoadTool

Hi Jonathan,

Sounds like something is very wrong here.

Are you running the job on an actual

RE: [EXTERNAL] Re: Java Out of Memory Errors with CsvBulkLoadTool

2015-12-18 Thread Cox, Jonathan A
-Xmx48g

-Original Message-
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Friday, December 18, 2015 8:17 AM
To: user@phoenix.apache.org
Subject: [EXTERNAL] Re: Java Out of Memory Errors with CsvBulkLoadTool

Hi Jonathan,

Sounds like something is very wrong here.

Are you running the job on

Re: [EXTERNAL] Re: Java Out of Memory Errors with CsvBulkLoadTool

2015-12-18 Thread Gabriel Reid
iel.r...@gmail.com]
> Sent: Friday, December 18, 2015 8:17 AM
> To: user@phoenix.apache.org
> Subject: [EXTERNAL] Re: Java Out of Memory Errors with CsvBulkLoadTool
>
> Hi Jonathan,
>
> Sounds like something is very wrong here.
>
> Are you running the job on an actual cluster, or

RE: [EXTERNAL] Re: Java Out of Memory Errors with CsvBulkLoadTool

2015-12-18 Thread Cox, Jonathan A
Hi Gabriel,

The Hadoop version is 2.6.2.

-Jonathan

-Original Message-
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Friday, December 18, 2015 11:58 AM
To: user@phoenix.apache.org
Subject: Re: [EXTERNAL] Re: Java Out of Memory Errors with CsvBulkLoadTool

Hi Jonathan,

Which

Java Out of Memory Errors with CsvBulkLoadTool

2015-12-17 Thread Cox, Jonathan A
I am trying to ingest a 575MB CSV file with 192,444 lines using the CsvBulkLoadTool MapReduce job. When running this job, I find that I have to boost the max Java heap space to 48GB (24GB fails with Java out of memory errors). I'm concerned about scaling issues. It seems like it shouldn't
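To put the figures from this message in proportion, here is a small back-of-the-envelope sketch. It is not part of the original thread. It assumes that "MB" and "GB" mean binary units (2^20 and 2^30 bytes), and the class name `IngestRatio` and its helpers are hypothetical names chosen for illustration.

```java
public class IngestRatio {

    // Assumption: binary units (1 MB = 2^20 bytes, 1 GB = 2^30 bytes).
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Average number of bytes per CSV line (integer division). */
    static long bytesPerLine(long fileBytes, long lines) {
        return fileBytes / lines;
    }

    /** How many times larger the heap is than the input file. */
    static double heapToFileRatio(long heapBytes, long fileBytes) {
        return (double) heapBytes / fileBytes;
    }

    public static void main(String[] args) {
        long fileBytes = 575L * MB;   // 575MB CSV file
        long lines = 192_444L;        // 192,444 lines
        long heapBytes = 48L * GB;    // 48GB max heap

        System.out.println("Average bytes per line: " + bytesPerLine(fileBytes, lines));
        System.out.printf("Heap / file size ratio: %.1f%n", heapToFileRatio(heapBytes, fileBytes));
    }
}
```

Under these assumptions each line averages roughly 3 KB, and the 48GB heap is about 85 times the size of the input file.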