Re: Incremental loading data slows performance

2014-11-20 Thread Gordon Benjamin
To follow on: I asked the developer how we incrementally load data, and the response was:

No. Union is used only for updated records (every night). For the minutes-interval export, the algorithm is:
1. Upload the file to Hadoop.
2. load data inpath... overwrite into table _incremental;
3. insert into table ..._cached
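The three steps above can be sketched as a small helper that builds the commands for one export cycle. This is only an illustration, not the developer's actual job: the function name, the base table name `events`, the file paths, and the `SELECT * FROM` source in step 3 are all hypothetical, since the original statements are elided. Only the `_incremental` and `_cached` suffixes come from the description.

```python
def incremental_load_statements(local_file, hdfs_path, base_table):
    """Build the commands for one export cycle (nothing is executed here).

    All arguments are hypothetical placeholders; the real paths and table
    names are not given in the thread.
    """
    staging = f"{base_table}_incremental"  # table overwritten on every cycle
    cached = f"{base_table}_cached"        # table that keeps growing
    return [
        # Step 1: upload the exported file to HDFS.
        f"hadoop fs -put {local_file} {hdfs_path}",
        # Step 2: replace the staging table's contents with the new file.
        f"LOAD DATA INPATH '{hdfs_path}' OVERWRITE INTO TABLE {staging}",
        # Step 3: append the staged rows to the cached table
        # (the SELECT source is an assumption; the original is elided).
        f"INSERT INTO TABLE {cached} SELECT * FROM {staging}",
    ]


if __name__ == "__main__":
    for command in incremental_load_statements(
        "export.csv", "/tmp/export/export.csv", "events"
    ):
        print(command)
```

If the job does look like this, every cycle adds a small append to the `_cached` table, which may be worth checking when looking for the source of the slowdown.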

Incremental loading data slows performance

2014-11-20 Thread Gordon Benjamin
Hi,

We are seeing bad performance as we incrementally load data. Here is the config:

Spark standalone cluster
spark01 (spark master, shark, hadoop namenode): 15GB RAM, 4 vCPUs
spark02 (spark worker, hadoop datanode): 15GB RAM, 8 vCPUs
spark03 (spark worker): 15GB RAM, 8 vCPUs
spark04 (spark worke