On Fri, 2015-01-09 at 19:28 +0100, Hannes Reinecke wrote:
[...]
> > I think you are assuming we are leaving the iscsi code as it is today.
> > 
> > For the non-MCS mq session per CPU design, we would be allocating and
> > binding the session and its resources to specific CPUs. They would only
> > be accessed by the threads on that one CPU, so we get our
> > serialization/synchronization from that. That is why we are saying we
> > do not need something like atomic_t/spin_locks for the sequence number
> > handling for this type of implementation.
> > 
> Wouldn't that need to be coordinated with the networking layer?
> Doesn't it do the same thing, matching TX/RX queues to CPUs?
> If so, wouldn't we decrease bandwidth by restricting things to one CPU?

So this is actually one of the fascinating questions on multi-queue.
Long ago, when I worked for the NCR OS group and we were bringing up the
first SMP systems, we actually found that the SCSI stack went faster
when bound to a single CPU.  The problem in those days was lock
granularity and contention, so single CPU binding eliminated that
overhead.  However, nowadays with modern multi-tiered caching and huge
latencies for cache line bouncing, we're approaching the point where the
fineness of our lock granularity is hurting performance, so it's worth
re-asking the question of whether just dumping all the lock latency by
single CPU binding is a worthwhile exercise.
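
To make the per-CPU idea from the quoted design concrete, here is a
minimal user-space C sketch. It is only an illustration under stated
assumptions: the struct, the function, NR_QUEUES and the 64-byte cache
line size are all hypothetical names and values, not the open-iscsi or
kernel code. Each session's sequence counter is a plain integer that is
safe only if the caller guarantees it is touched exclusively from the
thread bound to that CPU.

```c
#include <assert.h>
#include <stdint.h>

#define NR_QUEUES  4   /* hypothetical: one session per CPU */
#define CACHE_LINE 64  /* assumed cache line size in bytes */

/*
 * Hypothetical per-CPU session state. Aligning the first member pads
 * each array element to its own cache line, so two CPUs updating their
 * own counters never bounce a shared line between them.
 */
struct percpu_session {
	_Alignas(CACHE_LINE) uint32_t cmdsn; /* plain integer, no atomic_t */
	uint32_t exp_statsn;
};

static struct percpu_session sessions[NR_QUEUES];

/*
 * Return the next command sequence number for the session owned by
 * 'cpu'. No lock or atomic operation is used: correctness relies on the
 * assumption that only the thread bound to 'cpu' ever calls this for
 * that index.
 */
static uint32_t session_next_cmdsn(unsigned int cpu)
{
	assert(cpu < NR_QUEUES);
	return sessions[cpu].cmdsn++;
}
```

The trade-off is the one raised above: the counter update costs nothing
in synchronization, but all work for a given session is then confined to
a single CPU.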

James
