RE: [PATCH 06/17] block: introduce GENHD_FL_HIDDEN

2017-10-29 Thread Anish Jhaveri
On Saturday, October 28, 2017 3:10 AM, Guan Junxiong wrote:

>Does it mean some extra works such as:
>1) showing the path topology of nvme multipath device
Isn't the connection to the target supposed to guide the host to the shortest
and fastest available path, the so-called most optimized path? A sysfs entry
can easily expose that, as we already store path-related info there.
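
(Purely as illustration: a minimal userspace sketch of reading a hypothetical
per-path state attribute out of sysfs. The attribute path and name are
assumptions for the example, not something this patchset defines.)

#include <stdio.h>

int main(void)
{
	char state[32];
	/* hypothetical per-path attribute; actual layout is driver-defined */
	FILE *f = fopen("/sys/class/nvme/nvme0c0n1/ana_state", "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fgets(state, sizeof(state), f))
		printf("path state: %s", state);	/* e.g. "optimized" */
	fclose(f);
	return 0;
}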

>2) daemon to implement immediate and delayed failback
This is also driven by the target: whenever the target is ready for immediate
or delayed failback, it will let the connect from the host succeed. Until then
the host stays in a reconnecting state across all paths, until it finds an
optimized or un-optimized path. Why is this extra daemon needed?
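
(A toy userspace sketch of that reconnect behaviour: the host just cycles over
all known paths until the target lets a connect succeed, so no separate daemon
is needed. try_connect() is a hypothetical stand-in for a transport connect.)

#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define NPATHS 2

static bool try_connect(int path)
{
	/* stand-in: a real host would issue a fabrics connect here */
	return path == 1;	/* pretend only path 1 is ready so far */
}

int main(void)
{
	for (;;) {
		for (int p = 0; p < NPATHS; p++) {
			if (try_connect(p)) {
				printf("connected on path %d\n", p);
				return 0;
			}
		}
		sleep(1);	/* keep retrying until the target is ready */
	}
}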
 
>4) grouping paths besides ANA
Why can't we use an NS Group here?

It would be a good idea to move away from the legacy way of doing things for
NVMe devices. The NVMe multipath implementation by Christoph is very simple;
how about not making it super complicated?

Best regards,
Anish



Re: [PATCH 00/10] nvme multipath support on top of nvme-4.13 branch

2017-09-18 Thread Anish Jhaveri
On Fri, Sep 15, 2017 at 08:07:01PM +0200, Christoph Hellwig wrote:
> Hi Anish,
> 
> I looked over the code a bit, and I'm rather confused by the newly
> added commands.  Which controller supports them?  Also the NVMe
> working group went down a very different way with the ALUA approach,
> which uses different grouping concepts and doesn't require path
> activations - for Linux we'd really like to stick to only standardized
> multipath implementations.
Hi Christoph, 
Thanks for looking over the code. The newly added commands were to
support the Asymmetric NVMe Namespace Subsystem. That said, I can
remove those commands and make it an active-active configuration. I
noticed you have sent out a review with changes adding new structures
and functions, which makes it very similar in terms of implementation.
The only reason I proposed the given changes is that they don't
require any change to the kernel.



Re: [PATCH 00/10] nvme multipath support on top of nvme-4.13 branch

2017-09-18 Thread anish.jhaveri
On Wed, Sep 13, 2017 at 08:57:13AM +0200, Hannes Reinecke wrote:
> In general I am _not_ in favour of this approach.
> 
> This is essentially the same level of multipath support we had in the
> old qlogic and lpfc drivers in 2.4/2.6 series, and it took us _years_ to
> get rid of this.
> Main objection here is that it will be really hard (if not impossible)
> to use the same approach for other subsystems (eg SCSI), so we'll end up
> having different multipath implementations depending on which subsystem
> is being used.
> Which really is a maintenance nightmare.
> 
> I'm not averse to having other multipath implementations in the kernel,
> but it really should be abstracted so that other subsystems can _try_ to
> leverage it.

Got it. Thanks! 


Re: [PATCH 01/10] Initial multipath implementation.

2017-09-12 Thread Anish Jhaveri
On Tue, Sep 12, 2017 at 12:00:44PM -0400, Keith Busch wrote:
> 
> I find this patch series confusing to review. You declare these failover
> functions in patch 1, use them in patch 2, but they're not defined until
> patch 7.
Sorry for the late reply.

The idea was to keep the header file changes as a separate patch.
I will move the function declarations to patch 7 and rearrange the
patch series. If you or anyone else finds something that would help
in browsing the changes, I will try to incorporate it in the next
patchset.


Re: [PATCH 02/10] RDMA timeout triggers failover.

2017-09-12 Thread Anish Jhaveri

On Tue, Sep 12, 2017 at 08:48:58AM -0700, James Smart wrote:
> I don't know that this is a good idea - just because there's a controller reset
> we need to fail over a path? Also, putting failover smarts in the transport
> doesn't seem like a great idea. Also, there's more than just an rdma
> transport
> 
> -- james
>
Sorry for the late reply; configuring my email client took a while.

The failover trigger in the controller reset path can be removed, since a
keep_alive timer failure would trigger failover anyway.

But what if the target controller resets and comes back up within the window
in which the keep-alive timer gets serviced? That would lead to all I/O
requests being aborted after multiple retries, with those retries completing
in that same window, before another keep_alive would be serviced. That would
not trigger failover.
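
(To make the timing concrete, a toy userspace sketch of the race described
above; all constants are illustrative assumptions, not values taken from the
patchset.)

#include <stdio.h>

#define KATO_SEC	15	/* keep-alive interval (assumed) */
#define IO_RETRY_SEC	2	/* per-retry I/O timeout (assumed) */
#define IO_MAX_RETRIES	5
#define OUTAGE_SEC	10	/* controller reset + recovery time */

int main(void)
{
	int io_deadline = IO_RETRY_SEC * IO_MAX_RETRIES;

	if (OUTAGE_SEC < KATO_SEC)
		printf("keep-alive never expires (outage %ds < KATO %ds)\n",
		       OUTAGE_SEC, KATO_SEC);
	if (io_deadline <= OUTAGE_SEC)
		printf("I/O aborted after %ds of retries -> no failover\n",
		       io_deadline);
	return 0;
}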