Michael,

Thanks for the response. I profiled the job with VisualVM and YourKit, and the CPU is nearly idle (0.1%). I took a few thread dumps, and they show the main thread blocked in the flush() method inside the KV store.

On Sat, Jan 17, 2015 at 7:09 AM, Michael Rose <[email protected]> wrote:

> Is your process at 100% CPU? I suspect you're spending most of your time in
> JSON deserialization, but profile it and check.
>
> Michael
>
> On Friday, January 16, 2015, Roger Hoover <[email protected]> wrote:
>
> > Hi guys,
> >
> > I'm testing a job that needs to load 40M records (6GB in Kafka as JSON)
> > from a bootstrap topic. The topic has 4 partitions and I'm running the job
> > using the ProcessJobFactory so all four tasks are in one container.
> >
> > Using RocksDB, it's taking 19 minutes to load all the data which amounts to
> > 35k records/sec or 5MB/s based on input size. I ran iostat during this
> > time as see the disk write throughput is 14MB/s.
> >
> > I didn't tweak any of the storage settings.
> >
> > A few questions:
> > 1) Does this seem low? I'm running on a Macbook Pro with SSD.
> > 2) Do you have any recommendations for improving the load speed?
> >
> > Thanks,
> >
> > Roger
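
As a quick sanity check on the figures quoted above, the following is a minimal sketch that recomputes the load rates from the stated totals (40M records, 6GB of input, 19 minutes) and compares the input rate with the 14MB/s disk write throughput seen in iostat. The function name `load_rates` is hypothetical, and the conversion 1 GB = 1000 MB is an assumption; the thread does not say which unit convention was used.

```python
def load_rates(records, input_mb, seconds):
    """Return (records per second, input MB per second) for a bulk load."""
    return records / seconds, input_mb / seconds


RECORDS = 40_000_000      # 40M records, from the thread
INPUT_MB = 6_000          # 6 GB of JSON; assumes 1 GB = 1000 MB
SECONDS = 19 * 60         # 19 minutes of load time
DISK_WRITE_MB_S = 14      # disk write throughput reported by iostat

rec_s, mb_s = load_rates(RECORDS, INPUT_MB, SECONDS)

# Ratio of bytes written to disk per byte of input consumed.
# This is only a rough indicator; it ignores any other disk activity.
ratio = DISK_WRITE_MB_S / mb_s

print(f"{rec_s:,.0f} records/s, {mb_s:.1f} MB/s input, "
      f"{ratio:.1f}x disk writes per input byte")
```

The results agree with the roughly 35k records/sec and 5MB/s reported in the thread, and suggest the disk is writing a bit under three times the input volume.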
