Re: CSV bulk loading using map reduce

2014-12-23 Thread Gabriel Reid
Hi Noam,

I think that the things that most typically can affect MR loading performance are:

* number of regions (as this affects the number of reducers used to create the HFiles)
* amount of memory used for sort buffers
* use of compression on map output

With your 32-region salted table, it
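As a rough illustration of the sort-buffer and map-output-compression factors listed above, the sketch below assembles a command line for Phoenix's `CsvBulkLoadTool` with Hadoop properties passed as `-D` options. It is not taken from the thread: the helper name `bulk_load_command`, the jar path, the table name, the input path, and the default values are all hypothetical, and the property names are the Hadoop 2 ones (older clusters use different names). The number of reducers is not set here, since per the message above it follows from the table's region count.

```python
def bulk_load_command(client_jar, table, input_path,
                      sort_mb=256, compress_map_output=True):
    """Build an argv list for a CSV bulk load with MR tuning properties.

    All argument values are illustrative assumptions, not recommendations.
    """
    # Hadoop 2 property names; adjust for the Hadoop version in use.
    props = {
        "mapreduce.task.io.sort.mb": str(sort_mb),
        "mapreduce.map.output.compress": "true" if compress_map_output else "false",
    }
    if compress_map_output:
        # Snappy is an assumed codec choice; it must be installed on the cluster.
        props["mapreduce.map.output.compress.codec"] = (
            "org.apache.hadoop.io.compress.SnappyCodec"
        )

    cmd = ["hadoop", "jar", client_jar,
           "org.apache.phoenix.mapreduce.CsvBulkLoadTool"]
    # Generic -D options go before the tool's own arguments.
    cmd += [f"-D{key}={value}" for key, value in props.items()]
    cmd += ["--table", table, "--input", input_path]
    return cmd


if __name__ == "__main__":
    # Hypothetical jar, table, and input path, shown only to print the command.
    print(" ".join(bulk_load_command("phoenix-client.jar", "EXAMPLE", "/data/example.csv")))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; the resulting list could be handed to `subprocess.run` on a machine with a Hadoop client configured.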

RE: CSV bulk loading using map reduce

2014-12-23 Thread Bulvik, Noam
Thanks for the answer. We will look into it and update.

The Impala table is an Impala Parquet table.

-----Original Message-----
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Tuesday, December 23, 2014 2:27 PM
To: user@phoenix.apache.org
Subject: Re: CSV bulk loading using map reduce

Hi Noam, I

Re: CSV bulk loading using map reduce

2014-12-23 Thread Gabriel Reid
for the answer. We will look into it and update.

The Impala table is an Impala Parquet table.

-----Original Message-----
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Tuesday, December 23, 2014 2:27 PM
To: user@phoenix.apache.org
Subject: Re: CSV bulk loading using map reduce

Hi Noam, I think