FYI, for Linux with SSDs, changing the I/O scheduler from the default cfq to deadline or noop can reportedly make as much as a 500x improvement in write throughput. I haven't tried this myself. The Elasticsearch hardware guide describes it:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/hardware.html#_disks
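If anyone wants to try it, the scheduler is exposed through sysfs, so checking and switching it is a one-liner (the device name sda below is an assumption; substitute your own device, and note the change does not survive a reboot):

    # The scheduler in brackets is the active one, e.g. "noop deadline [cfq]"
    cat /sys/block/sda/queue/scheduler

    # Switch to deadline; takes effect immediately
    echo deadline | sudo tee /sys/block/sda/queue/scheduler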
On Tue, Jan 20, 2015 at 9:28 AM, Chris Riccomini <
criccom...@linkedin.com.invalid> wrote:

> Hey Roger,
>
> We did some benchmarking, and discovered very similar performance to
> what you've described. We saw ~40k writes/sec and ~20k reads/sec,
> per-container, on a Virident SSD. This was without any changelog. Are
> you using a changelog on the store?
>
> When we attached a changelog to the store, the writes dropped
> significantly (~1000 writes/sec). When we hooked up VisualVM, we saw
> that the container was spending >99% of its time in
> KafkaSystemProducer.send().
>
> We're currently doing two things:
>
> 1. Working with our performance team to understand and tune RocksDB
>    properly.
> 2. Upgrading the Kafka producer to use the new Java-based API.
>    (SAMZA-227)
>
> For (1), it seems like we should be able to get much higher throughput
> from RocksDB. Anecdotally, we've heard that RocksDB requires many
> threads in order to max out an SSD, and since Samza is single-threaded,
> we could just be hitting a RocksDB bottleneck. We won't know until we
> dig into the problem (which we started investigating last week). The
> current plan is to start by benchmarking RocksDB JNI outside of Samza
> and see what we can get. From there, we'll know our "speed of light"
> and can try to get Samza as close as possible to it. If RocksDB JNI
> can't be made to go "fast", then we'll have to understand why.
>
> (2) should help with the changelog issue. I believe the slowness with
> the changelog comes from the changelog using a sync producer to send to
> Kafka, which blocks when a batch is flushed. In the new API, the
> concept of a "sync" producer is removed. All writes are handled on an
> async writer thread (though we can still guarantee writes are safely
> written before checkpointing, which is what we need).
>
> In short, I agree, it seems slow. We see this behavior, too. We're
> digging into it.
>
> Cheers,
> Chris
>
> On 1/17/15 12:58 PM, "Roger Hoover" <roger.hoo...@gmail.com> wrote:
>
> >Michael,
> >
> >Thanks for the response. I used VisualVM and YourKit and saw that the
> >CPU is not being used (0.1%). I took a few thread dumps and saw the
> >main thread blocked on the flush() method inside the KV store.
> >
> >On Sat, Jan 17, 2015 at 7:09 AM, Michael Rose <elementat...@gmail.com>
> >wrote:
> >
> >> Is your process at 100% CPU? I suspect you're spending most of your
> >> time in JSON deserialization, but profile it and check.
> >>
> >> Michael
> >>
> >> On Friday, January 16, 2015, Roger Hoover <roger.hoo...@gmail.com>
> >> wrote:
> >>
> >> > Hi guys,
> >> >
> >> > I'm testing a job that needs to load 40M records (6GB in Kafka as
> >> > JSON) from a bootstrap topic. The topic has 4 partitions, and I'm
> >> > running the job using the ProcessJobFactory, so all four tasks are
> >> > in one container.
> >> >
> >> > Using RocksDB, it takes 19 minutes to load all the data, which
> >> > works out to 35k records/sec, or 5MB/s based on input size. I ran
> >> > iostat during this time and saw that the disk write throughput was
> >> > 14MB/s.
> >> >
> >> > I didn't tweak any of the storage settings.
> >> >
> >> > A few questions:
> >> > 1) Does this seem low? I'm running on a MacBook Pro with an SSD.
> >> > 2) Do you have any recommendations for improving the load speed?
> >> >
> >> > Thanks,
> >> >
> >> > Roger
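For anyone wanting to reproduce the standalone RocksDB JNI "speed of light" test Chris mentions, a minimal single-threaded write benchmark using the rocksdbjni API could look something like the sketch below. This is only an illustration of the idea, not the harness LinkedIn ran; the record count, the 100-byte value size, and the /tmp/rocksdb-bench path are all made-up parameters.

    import org.rocksdb.Options;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksDBException;

    public class RocksDbWriteBench {
      public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();

        // Default options only; tuning write buffers etc. is the next step.
        Options options = new Options().setCreateIfMissing(true);
        RocksDB db = RocksDB.open(options, "/tmp/rocksdb-bench");

        int numRecords = 1_000_000;   // arbitrary workload size
        byte[] value = new byte[100]; // arbitrary 100-byte payload

        long start = System.nanoTime();
        for (int i = 0; i < numRecords; i++) {
          byte[] key = String.format("key-%09d", i).getBytes();
          db.put(key, value);
        }
        double secs = (System.nanoTime() - start) / 1e9;

        System.out.printf("%d writes in %.1fs = %.0f writes/sec%n",
            numRecords, secs, numRecords / secs);
        db.close();
      }
    }

A single-threaded loop like this roughly mirrors what one Samza task can push; running the same loop from several threads against the one db handle (RocksDB supports concurrent writers) would be a way to test the many-threads-to-saturate-an-SSD theory.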