Thank you! Samarth's solution looks like it'll work for me. One question: you mentioned that the Phoenix client keeps uncommitted rows in memory until they're sent over to HBase. When we call conn.commit(), does that send the rows over to HBase immediately?
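
For reference, here is a minimal sketch of how I plan to wrap the commit-batching loop. It uses plain JDBC calls only. The class name BatchedUpsert, the method upsertAll, and passing each row as an Object[] of bind values are my own placeholders, not anything from the Phoenix API. I'm assuming the caller supplies a parameterized UPSERT statement that matches the rows.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper (not part of Phoenix): commits every commitSize rows.
public final class BatchedUpsert {
    private BatchedUpsert() {}

    /**
     * Executes the given parameterized statement once per row and commits
     * after every commitSize rows, plus once more for any leftover rows.
     * Returns the number of commits issued.
     */
    public static int upsertAll(Connection conn, String sql,
                                List<Object[]> rows, int commitSize)
            throws SQLException {
        if (commitSize <= 0) {
            throw new IllegalArgumentException("commitSize must be positive");
        }
        // With auto-commit off, rows stay uncommitted until commit() is called.
        conn.setAutoCommit(false);
        int pending = 0;
        int commits = 0;
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (Object[] row : rows) {
                // JDBC bind parameters are 1-indexed.
                for (int i = 0; i < row.length; i++) {
                    stmt.setObject(i + 1, row[i]);
                }
                stmt.executeUpdate();
                pending++;
                if (pending == commitSize) {
                    conn.commit();
                    commits++;
                    pending = 0;
                }
            }
            // Commit the last, possibly partial, batch.
            if (pending > 0) {
                conn.commit();
                commits++;
            }
        }
        return commits;
    }
}
```

I'd call it as BatchedUpsert.upsertAll(conn, "UPSERT INTO MY_TABLE VALUES (?, ?)", rows, 1000), where MY_TABLE is a made-up table name. Keeping commitSize modest should bound how many uncommitted rows sit in client memory at once.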

--Jeremy

On Wed, Aug 19, 2015 at 7:19 PM, ALEX K <[email protected]> wrote:

> I'm using the same solution as Samarth suggested (commit batching), it
> brings down latency per single row upsert from 50ms to 5ms (averaged after
> batching)
>
> On Wed, Aug 19, 2015 at 7:11 PM, Samarth Jain <[email protected]>
> wrote:
>
>> You can do this via phoenix by doing something like this:
>>
>> try (Connection conn = DriverManager.getConnection(url)) {
>>     conn.setAutoCommit(false);
>>     int batchSize = 0;
>>     // number of rows you want to commit per batch.
>>     // Change this value according to your needs.
>>     int commitSize = 1000;
>>     while (there are records to upsert) {
>>         stmt.executeUpdate();
>>         batchSize++;
>>         if (batchSize % commitSize == 0) {
>>             conn.commit();
>>         }
>>     }
>>     conn.commit(); // commit the last batch of records
>> }
>>
>> You don't want commitSize to be too large since Phoenix client keeps the
>> uncommitted rows in memory till they are sent over to HBase.
>>
>> On Wed, Aug 19, 2015 at 3:05 PM, Serega Sheypak <[email protected]> wrote:
>>
>>> I would suggest you to use
>>>
>>> https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html
>>> instead of list of puts and share mutableBuffer across threads (it's
>>> thread-safe). I reduced my response time from 30-40 ms to 4ms while using
>>> BufferedMutator. It also sends mutations in async mode. :)
>>>
>>> I meet the same problem. Can't force Phoenix to buffer upserts on
>>> client-side and then send them to HBase in small batches.
>>>
>>> 2015-08-19 19:40 GMT+02:00 jeremy p <[email protected]>:
>>>
>>>> Hello all,
>>>>
>>>> I need to do true batch updates to a Phoenix table. By this, I mean
>>>> sending a bunch of updates to HBase as part of a single request. The HBase
>>>> API offers this behavior with the Table.put(List<Put> puts) method. I
>>>> noticed PhoenixStatement exposes an executeBatch() method, however, this
>>>> method just executes the batched statements one-by-one. This will not
>>>> deliver the performance that the HBase API exposes through their batch put
>>>> method.
>>>>
>>>> What is the best way for me to do true batch updates to a Phoenix
>>>> table? I need to do this programmatically, so I cannot use the command
>>>> line bulk insert utility.
>>>>
>>>> --Jeremy
