Thanks. What is this number, 2004857600000? Is it in bits or bytes?

Thanks,
Selva
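For a sense of scale, here is a minimal sketch that converts the suggested value into larger units. It assumes the setting is a byte count, which should be verified against the Hudi configuration docs for 0.5.0; the helper names are made up for illustration and are not part of Hudi.

```python
# Hypothetical helpers (not part of Hudi) for reading a large size value.
# Assumption: hoodie.memory.merge.max.size is expressed in bytes.

def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to binary gibibytes (1 GiB = 1024**3 bytes)."""
    return n_bytes / 1024**3


def bytes_to_gb(n_bytes: int) -> float:
    """Convert a byte count to decimal gigabytes (1 GB = 10**9 bytes)."""
    return n_bytes / 10**9


value = 2004857600000  # the value suggested in the thread

# Read as bytes, this is roughly 1867 GiB (about 2005 GB).
print(f"as bytes: {bytes_to_gib(value):.2f} GiB, {bytes_to_gb(value):.2f} GB")

# Read as bits, it would be one eighth of that, shown only for contrast.
print(f"as bits:  {bytes_to_gib(value // 8):.2f} GiB")
```

Either way the value is far larger than the memory of a typical executor, so it reads as "effectively no limit" rather than as a precisely tuned figure.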
On Tue, Mar 10, 2020 at 2:57 AM lamberken <[email protected]> wrote:

> hi,
>
> IMO, when upserting 150K records with 100 columns, these records need to be
> serialized to disk and deserialized from disk.
> You can try adding option("hoodie.memory.merge.max.size", "2004857600000")
>
> best,
> lamber-ken
>
> At 2020-03-10 17:07:58, "selvaraj periyasamy" <[email protected]> wrote:
>
> Sorry for the partial emails. My company portal doesn't allow me to add test
> code. I am using the 0.5.0 version of the Hudi jars, built locally. While
> running upsert, it takes more than 6 or 7 minutes to process 150k records.
>
> Is there any tuning that could reduce the processing time from 6 or 7
> minutes? Overwrite takes less than a minute. Each row has 100 columns.
>
> Thanks,
> Selva
>
> On Tue, Mar 10, 2020 at 1:51 AM selvaraj periyasamy <[email protected]> wrote:
>
> Team,
>
> I am using the 0.5.0 version of the Hudi jars, built locally. While running
> upsert, it takes more than 6 or 7 minutes to process 150k records. Below
> are the code and logs.
>
> 20/03/10 07:26:09 INFO IteratorBasedQueueProducer: starting to buffer records
> 20/03/10 07:26:09 INFO BoundedInMemoryExecutor: starting consumer thread
> 20/03/10 07:33:59 INFO IteratorBasedQueueProducer: finished buffering records
> 20/03/10 07:34:00 INFO BoundedInMemoryExecutor: Queue Consumption is done; notifying producer threads
>
> 20/03/10 07:26:08 INFO IteratorBasedQueueProducer: starting to buffer records
> 20/03/10 07:26:08 INFO BoundedInMemoryExecutor: starting consumer thread
> 20/03/10 07:33:31 INFO IteratorBasedQueueProducer: finished buffering records
> 20/03/10 07:33:31 INFO BoundedInMemoryExecutor: Queue Consumption is done; notifying producer threads
>
> While running insert
>
> On Tue, Mar 10, 2020 at 1:45 AM selvaraj periyasamy <[email protected]> wrote:
>
> Team,
>
> I am using the 0.5.0 version of the Hudi jars, built locally.
> While running upsert
>
> 20/03/10 07:26:09 INFO IteratorBasedQueueProducer: starting to buffer records
> 20/03/10 07:26:09 INFO BoundedInMemoryExecutor: starting consumer thread
> 20/03/10 07:33:59 INFO IteratorBasedQueueProducer: finished buffering records
> 20/03/10 07:34:00 INFO BoundedInMemoryExecutor: Queue Consumption is done; notifying producer threads
>
> 20/03/10 07:26:08 INFO IteratorBasedQueueProducer: starting to buffer records
> 20/03/10 07:26:08 INFO BoundedInMemoryExecutor: starting consumer thread
> 20/03/10 07:33:31 INFO IteratorBasedQueueProducer: finished buffering records
> 20/03/10 07:33:31 INFO BoundedInMemoryExecutor: Queue Consumption is done; notifying producer threads
