On Fri, Dec 18, 2015 at 9:35 PM, Cox, Jonathan A wrote:
>
> The Hadoop version is 2.6.2.
>
I'm assuming the reduce phase is failing with the OOME, is that correct?
Could you run "jps -v" to see the full set of JVM parameters for the JVM that is running the failing task?
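As an aside, one quick way to pull just the heap flag out of "jps -v" output is a grep over the line for the task JVM. The process id, class name, and flag values below are illustrative, not taken from this thread:

```shell
# Hypothetical sample line of `jps -v` output for a MapReduce child task
# (pid, main class, and JVM flags are made up for illustration):
line='12345 YarnChild -Xmx48g -Djava.io.tmpdir=/tmp -Dhadoop.metrics.log.level=WARN'
# Extract the max-heap flag to see what the task JVM was actually given:
echo "$line" | grep -o '\-Xmx[0-9]*[gGmM]'   # prints "-Xmx48g"
```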
On Fri, Dec 18, 2015 at 4:31 PM, Riesland, Zack wrote:
> We are able to ingest MUCH larger sets of data (hundreds of GB) using the
> CSVBulkLoadTool.
>
> However, we have found it to be a huge memory hog.
>
> We dug into the source a bit and found that
>
Hi Jonathan,
Sounds like something is very wrong here.
Are you running the job on an actual cluster, or are you using the local job tracker (i.e. running the import job on a single computer)?
Normally an import job, regardless of the size of the input, should run with map and reduce tasks that use a standard heap size.
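One way to check the cluster-vs-local question is to look at mapreduce.framework.name in mapred-site.xml. The snippet below writes an illustrative fragment to /tmp for demonstration; on a real node, inspect your cluster's own copy (the path varies by distribution, e.g. /etc/hadoop/conf/mapred-site.xml):

```shell
# Illustrative mapred-site.xml fragment, written to /tmp only for this demo:
cat > /tmp/mapred-site-demo.xml <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
# "yarn" means tasks run distributed on the cluster; "local" means the whole
# job runs inside a single client JVM, which would explain one giant heap.
grep -A1 'mapreduce.framework.name' /tmp/mapred-site-demo.xml | grep -o '<value>[^<]*</value>'
```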
Hi Gabriel,
The Hadoop version is 2.6.2.
-Jonathan
-Original Message-
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Friday, December 18, 2015 11:58 AM
To: user@phoenix.apache.org
Subject: Re: [EXTERNAL] Re: Java Out of Memory Errors with CsvBulkLoadTool
Hi Jonathan,
Which Hadoop version are you using?
I am trying to ingest a 575MB CSV file with 192,444 lines using the CsvBulkLoadTool MapReduce job.
When running this job, I find that I have to boost the max Java heap space to 48GB (24GB fails with Java out-of-memory errors).
I'm concerned about scaling issues. It seems like it shouldn't take anywhere near that much memory to ingest a 575MB file.
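A back-of-the-envelope check on the numbers above (plain shell arithmetic) shows each record averages only about 3 KB, which underlines how anomalous a 48GB heap requirement is for this input:

```shell
# 575 MB spread over 192,444 lines works out to ~3 KB per record:
bytes_per_line=$(( (575 * 1024 * 1024) / 192444 ))
echo "$bytes_per_line"   # prints 3133
```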