[Nbd] [PATCH v3 0/2] block size extension

2016-04-13 Thread Eric Blake
Applies on top of Alex's v7 SHOULD/MUST/MAY patch.

In v3:

Rework NBD_OPT_GO reply (yet again), where the first patch
introduces the framework of NBD_REP_INFO/NBD_REP_EXPORT and
the second extends the framework with NBD_INFO_BLOCK_SIZE.

Add NBD_OPT_BLOCK_SIZE for the client to advertise that it will
obey alignment constraints.

Document a default minimum block size of 1, and a default preferred
block size of 4096.

Tweak some wordings based on review.

Eric Blake (2):
  doc: Use dedicated reply types for NBD_OPT_INFO/GO
  doc: Add details on block sizes

 doc/proto.md | 290 ++-
 1 file changed, 230 insertions(+), 60 deletions(-)

-- 
2.5.5


--
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
___
Nbd-general mailing list
Nbd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nbd-general


[Nbd] [PATCH v3 2/2] doc: Add details on block sizes

2016-04-13 Thread Eric Blake
Existing NBD servers often have limitations, such as requiring
actions to be aligned to block sizes or limiting maximum
transactions to avoid denial of service attacks; for example,
qemu's NBD server refuses any transaction larger than 32M.  But
to date, clients have to learn these limitations via out-of-band
means, and nothing in the spec allowed for alignment limitations.

Add a section to the document describing overall block size
constraints, and rules for what defaults to use if there is no
communication (whether out of band, or by the new options added
here).

Also, add a new client option NBD_OPT_BLOCK_SIZE (a promise that
the client will obey any advertised block sizes, to let a server
optimize to use O_DIRECT without worrying about how it would have
to report errors), and extend NBD_REP_INFO (to allow the server
to advertise block sizes in band, for a new enough client that
uses NBD_OPT_GO).

Design decision: a client that wants to learn block sizes MUST
use NBD_OPT_GO, rather than the old NBD_OPT_EXPORT_NAME, even
though we could have repurposed some of the reserved zeroes when
NBD_FLAG_C_NO_ZEROES is not in effect, because we don't want to
encourage any further abuse of NBD_OPT_EXPORT_NAME.

Signed-off-by: Eric Blake 
---
 doc/proto.md | 183 ---
 1 file changed, 161 insertions(+), 22 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index 8d1bd5e..58295b0 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -590,6 +590,85 @@ This functionality has not yet been implemented by the reference
 implementation, but was implemented by qemu and subsequently
 by other users, so has been moved out of the "experimental" section.

+## Block sizes
+
+During transmission phase, several operations are constrained by the
+export size sent in reply to `NBD_OPT_EXPORT_NAME` or `NBD_OPT_GO`, as
+well as by three block sizes defined here (minimum, preferred, and
+maximum).  During the handshake phase, a client MAY announce its
+intention to honor server block sizes via the experimental
+`BLOCK_SIZE` extension; see below.  A client that is worried about
+block size constraints SHOULD use `NBD_OPT_GO` rather than
+`NBD_OPT_EXPORT_NAME`.  A server SHOULD advertise the block size
+constraints during handshake phase via the experimental `INFO`
+extension; see below.  A server and client MAY agree on block sizes
+via out of band means.  Since a server cannot advertise block sizes
+for clients that use `NBD_OPT_EXPORT_NAME`, a server that wants to
+enforce block sizes other than the defaults specified here MUST
+support the experimental `INFO` extension, and MAY refuse to connect
+to a client that does not use `NBD_OPT_GO`, but SHOULD NOT refuse to
+connect to a client that does not use `NBD_OPT_BLOCK_SIZE`.
+
+If block sizes have not been advertised or agreed on externally, then
+a client SHOULD assume a default minimum block size of 1, a preferred
+block size of 4,096, and no inherent maximum block size; but MAY
+choose to operate as if other block sizes had been specified (for
+example, by using a minimum block size of 512, a preferred block size
+of 65,536, and a maximum block size of 33,554,432 (32M)).  A server
+that does not advertise block sizes (including via external agreement)
+SHOULD NOT reject operations due to alignment constraints (that is, it
+SHOULD treat the minimum block size as 1), and SHOULD have no inherent
+maximum block size, although it MAY still terminate the connection on
+any request where the length is large enough to be deemed a denial of
+service attack. A server that does advertise block sizes MUST be
+prepared for clients that ignore the advertisement, by, at the very
+least, sending appropriate errors, and MUST NOT corrupt data in that
+case.
+
+The minimum block size represents the smallest addressable length and
+alignment within the export, although writing to an area that small
+may require the server to use a less-efficient read-modify-write
+action.  If advertised, this value MUST be a power of 2, MUST NOT be
+larger than 65,536, and MAY be as small as 1 for an export backed by a
+regular file, although the values of 512 or 4,096 are more typical for
+an export backed by a block device.  If a server advertises a minimum
+block size, the advertised export size MUST be an integer multiple of
+that block size.
+
+The preferred block size represents the minimum size at which aligned
+requests will have efficient I/O, avoiding behaviour such as
+read-modify-write.  If advertised, this MUST be a power of 2 at least
+as large as the smaller of the minimum block size and 4,096, although
+larger values (such as the minimum granularity of a hole) are also
+appropriate.  The preferred block size MAY be larger than the export
+size, in which case the client is unable to utilize the preferred
+block size for that export.
+
+The maximum block size represents the maximum length that the server
+is willing to handle in one 

[Nbd] [PATCH v3 1/2] doc: Use dedicated reply types for NBD_OPT_INFO/GO

2016-04-13 Thread Eric Blake
Since NBD_OPT_INFO and NBD_OPT_GO are experimental, we still have
a chance to fix them up before promoting them to stable.

Attempting to reuse NBD_REP_SERVER as the reply to NBD_OPT_INFO and
NBD_OPT_GO has a few problems: clients must be prepared to parse
two different styles of the reply, based on which option request
the reply is answering.  Extending the information to provide even
more details, like block sizing, is awkward (the only way to do it
within a single reply is to have multiple length fields that must
all be consistent; and pre-computing the overall header length may
be difficult).  And requiring the server to parrot back the export
name is wasteful if the client's name is already in canonical
form.

Solve this by instead making the valid response be a series of reply
messages (similar to how NBD_OPT_LIST has a series).  The series
is always ended by the new NBD_REP_EXPORT, which is basically a
structured header pasted in front of the original response to
NBD_OPT_EXPORT_NAME.  But prior to that terminal message, the
server can now send as many additional optional items of
information as it wants (clients must ignore the ones they don't
recognize).  This patch starts with a single piece of information:
an alternate name for the export.  A future patch will then add
another piece of information, for advertising server block sizes.
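
As a rough illustration of the reply series described above, a client
loop might look like the following Python sketch. The reply type
numbers 3 and 4 are the ones this patch assigns; everything else is an
assumption made for illustration: the `(reply_type, payload)` tuples
stand in for real wire parsing, the `(info_type, data)` shape of an
`NBD_REP_INFO` payload is hypothetical, and `collect_export_reply` is
not part of any existing client.

```python
NBD_REP_INFO = 3            # value assigned by this patch
NBD_REP_EXPORT = 4          # value assigned by this patch
NBD_REP_ERR_FLAG = 1 << 31  # error replies have bit 31 set


def collect_export_reply(replies, handlers):
    """Consume replies until the terminating NBD_REP_EXPORT.

    `replies` is an iterable of (reply_type, payload) tuples (hypothetical,
    already parsed). `handlers` maps an info type to a function that
    decodes its data. Returns (export_payload, understood_infos).
    """
    understood = {}
    for reply_type, payload in replies:
        if reply_type & NBD_REP_ERR_FLAG:
            raise RuntimeError(f"server returned error reply {reply_type:#x}")
        if reply_type == NBD_REP_INFO:
            info_type, data = payload  # hypothetical payload shape
            handler = handlers.get(info_type)
            if handler is not None:
                understood[info_type] = handler(data)
            # Info types the client does not recognize are skipped.
            continue
        if reply_type == NBD_REP_EXPORT:
            return payload, understood
        raise RuntimeError(f"unexpected reply type {reply_type}")
    raise RuntimeError("reply series ended without NBD_REP_EXPORT")
```

The point of the sketch is only that unknown information items do not
stop the loop, so later patches can add items without breaking older
clients.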

Additionally:

- The wording was a bit repetitive; say the same thing in fewer
sentences.

- Swap paragraph ordering so that NBD_REP_INFO/EXPORT details aren't
split by a side-note about NBD_REP_ERR_UNSUP.

Signed-off-by: Eric Blake 
---
 doc/proto.md | 107 ++-
 1 file changed, 69 insertions(+), 38 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index fca85e2..8d1bd5e 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -753,11 +753,12 @@ during option haggling in the fixed newstyle negotiation.

 * `NBD_REP_SERVER` (2)

-A description of an export. Data:
+A description of an export name. Data:

 - 32 bits, length of name (unsigned); MUST be no larger than the
   reply packet header length - 4
 - String, name of the export, as expected by `NBD_OPT_EXPORT_NAME`
+  or `NBD_OPT_INFO`
 - If length of name < (reply packet header length - 4), then the
   rest of the data contains some implementation-specific details
   about the export. This is not currently implemented, but future
@@ -766,9 +767,13 @@ during option haggling in the fixed newstyle negotiation.
   particular client request, this field is defined to be a string
   suitable for direct display to a human being.

-The experimental `INFO` extension (see below) adds two client
-option requests where the extra data has a definition other than a
-text string.
+* `NBD_REP_INFO` (3)
+
+Defined by the experimental `INFO` extension; see below.
+
+* `NBD_REP_EXPORT` (4)
+
+Defined by the experimental `INFO` extension; see below.

 There are a number of error reply types, all of which are denoted by
 having bit 31 set. All error replies MAY have some data set, in which
@@ -1000,13 +1005,13 @@ any structured message). This is a result of a (misguided) attempt to
 keep backwards compatibility with non-fixed newstyle negotiation.

 To remedy this, an `INFO` extension is envisioned. This extension adds
-two option requests and one error reply type, and extends one existing
-option reply type.
+two option requests, two option reply types, and one error reply type.

-Both options have identical formats for requests and replies. The
-only difference is that after a successful reply to `NBD_OPT_GO`
-(i.e. an `NBD_REP_SERVER`), transmission mode is entered immediately.
-Therefore these commands share common documentation.
+Both options have identical formats for requests and replies. The only
+difference is that after a successful reply to `NBD_OPT_GO` (i.e. zero
+or more `NBD_REP_INFO` then an `NBD_REP_EXPORT`), transmission mode is
+entered immediately.  Therefore these commands share common
+documentation.

 * `NBD_OPT_INFO` and `NBD_OPT_GO`

@@ -1038,52 +1043,78 @@ Therefore these commands share common documentation.
   server.
 - `NBD_REP_ERR_TLS_REQD`: The server does not wish to export this
   block device unless the client initiates TLS first.
-- `NBD_REP_SERVER`: The server accepts the chosen export.
+- A series of zero or more `NBD_REP_INFO`, followed by a terminating
+  `NBD_REP_EXPORT`: The server accepts the chosen export.

 Additionally, if TLS has not been initiated, the server MAY reply
 with `NBD_REP_ERR_TLS_REQD` (instead of `NBD_REP_ERR_UNKNOWN`)
 to requests for exports that are unknown. This is so that clients
 that have not initiated TLS cannot enumerate exports.

-In the case of `NBD_REP_SERVER`, the message's data takes on a different
-interpretation than the default (so as to provide additional
-binary information 

Re: [Nbd] [PATCHv7] docs/proto.md: Clarify SHOULD / MUST / MAY etc

2016-04-13 Thread Eric Blake
On 04/12/2016 12:35 PM, Alex Bligh wrote:
> These are changes which possibly have semantic effect
> 
> * Clarify that SHOULD / MUST / MAY etc. when in capitals have an
>   RFC 2119 meaning using the wording within that RFC.
> 
> * Fix some lowercase use of these words which actually were
>   meant to be uppercase.
> 
> * Fix some lowercase 'should' which clearly meant 'MUST'; where
>   it's not obvious, I've made them 'SHOULD' or left them as is.
> 
> * Fix wording on transmission flags to be clearer.
> 
> Signed-off-by: Alex Bligh 
> ---
>  doc/proto.md | 100 
> +++
>  1 file changed, 59 insertions(+), 41 deletions(-)

Reviewed-by: Eric Blake 

Is there anything still holding this up, or can we get it applied? It's
acting as a conflict magnet for other patches that are still under
discussion.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Nbd] [PATCH] Docs: improve description of disconnection methods

2016-04-13 Thread Alex Bligh

On 13 Apr 2016, at 17:09, Alex Bligh  wrote:

> Here's what
> https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt
> has to say:

Aha. I found the original thread where I asked about this
(it was on linux-fsdevel):

http://www.spinics.net/lists/linux-fsdevel/msg45584.html

Specifically this from Tejun Heo who I believe is/was a block
layer / fs layer maintainer:

http://www.spinics.net/lists/linux-fsdevel/msg45616.html
> On Wed, May 25, 2011 at 5:54 PM, Alex Bligh  wrote:
> > a) If I do not complete a write command, I may avoid writing it to disk
> >  indefinitely (despite completing subsequently received FLUSH
> >  commands). The only flushes to disk that I am obliged to flush
> >  are those that I've actually told the block layer that I have done.
> 
> Yes, driver doesn't have any ordering responsibility w.r.t. FLUSH for
> writes which it hasn't declared finished yet.

> > b) If I receive a flush command, and prior to completing that flush
> >  command, I receive subsequent write commands, I may execute
> >  (and, if I like, write, to disk) write commands received AFTER that
> >  flush command. I presume if the subsequent write commands write to
> >  blocks that I am meant to be flushing, I can just forget about
> >  the blocks I am meant to be flushing (because they would be
> >  overwritten) provided *something* overwritten what was there before.
> 
> The first half is correct.  The latter half may be correct if there's
> no intervening write but _please_ don't do that.  If there's something
> to be optimized there, it should be done in upper layers.  It's
> playing with fire.



--
Alex Bligh








Re: [Nbd] [PATCH] Docs: improve description of disconnection methods

2016-04-13 Thread Alex Bligh

On 13 Apr 2016, at 16:39, Eric Blake  wrote:

>> 
>> I wouldn't want to lose that. So if the client sends NBD_CMD_DISC
>> without waiting for all his inflight commands to complete, those
>> inflight commands may not be executed at all, because the server
>> is free to process commands in any order. It's going to make
>> server design very awkward if you can only process /some/ commands
>> out of order.
> 
> We already have that constraint - commands with NBD_CMD_FLAG_FUA must be
> processed in a particular order,

No, that is not true. Commands with NBD_CMD_FLAG_FUA set may be
processed in any order with respect to other commands. They just
MUST NOT reply until the data within them is written to the disk.
This is exactly per the linux block layer.

Here's what the spec says.

 "A server MUST NOT reply to a command that has NBD_CMD_FLAG_FUA set
 in its command flags until the data (if any) written by that command
 is persisted to non-volatile storage."

It is perfectly possible to reorder a FUA write to run before
another write that is issued first, or behind another write
that is issued after.

FUA is completely independent of ordering.

> and NBD_CMD_FLUSH must be processed in
> a particular order.

No, that's not true either (but you're closer).

Here's what the spec says (again, this is the same as the Linux block
layer):

"All write commands (that includes NBD_CMD_WRITE, NBD_WRITE_ZEROES
and NBD_CMD_TRIM) that the server completes (i.e. replies to) prior
to processing to a NBD_CMD_FLUSH MUST be written to non-volatile
storage prior to replying to that NBD_CMD_FLUSH"

That is NOT imposing an ordering constraint on processing commands.
It's saying that when you reply to NBD_CMD_FLUSH all the writes
that have been *replied to* must be completed. So you are at
liberty to process the flush before all your pending writes and
not flush them if you want. You can thus reorder your flushes
as much as you like; if the client doesn't want them reordered
it should not issue the flush until after the writes have
replied.

Strange but true, but this is how the Linux block layer works.
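
The rule above can be pictured with a toy model. This is only a
sketch of the semantics being described, not how nbd-server.c or any
other real server is structured; the class name `ToyFlushModel` and
its methods are made up for illustration.

```python
class ToyFlushModel:
    """Tracks which writes a flush reply is obliged to have made durable."""

    def __init__(self):
        self.inflight = set()  # writes received but not yet replied to
        self.replied = set()   # writes replied to (may still be in cache)
        self.durable = set()   # writes known to be on non-volatile storage

    def receive_write(self, handle):
        self.inflight.add(handle)

    def reply_write(self, handle):
        self.inflight.remove(handle)
        self.replied.add(handle)

    def reply_flush(self):
        # Only writes that were already replied to must be durable before
        # the flush reply is sent; in-flight writes carry no obligation.
        self.durable |= self.replied
        return sorted(self.durable)
```

For example, if writes `w1` and `w2` have been received but only `w1`
has been replied to, a flush reply only guarantees `w1`:

```python
m = ToyFlushModel()
m.receive_write("w1")
m.receive_write("w2")
m.reply_write("w1")
m.reply_flush()  # w1 is now durable; w2 is still in flight
```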

(BTW this was precisely the point I was trying to clarify
earlier on, and I had an additional 'SHOULD' behaviour to
advise against this, and people wanted it out).

Here's what
https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt
has to say:

> Explicit cache flushes
> ----------------------
> 
> The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from
> the filesystem and will make sure the volatile cache of the storage device
> has been flushed before the actual I/O operation is started.  This explicitly
> guarantees that previously completed write requests are on non-volatile
> storage before the flagged bio starts. In addition the REQ_FLUSH flag can be
> set on an otherwise empty bio structure, which causes only an explicit cache
> flush without any dependent I/O.  It is recommend to use
> the blkdev_issue_flush() helper for a pure cache flush.

Note that it only talks about COMPLETED writes.

There is no restriction on reordering writes.

> Requiring NBD_CMD_DISC to be processed last is not
> much different than these other two situations (well, different in that
> the other two only have to guarantee that commands _with replies_ have
> hit permanent storage, not ALL commands received).

Seems to me very different as I don't think those ordering constraints
you mention actually exist! Certainly nbd-server.c does not implement
them and neither does gonbdserver.

But:

>> Another alternative would be to make the server
>> wait for all commands to complete before acting on the disconnect
>> (as opposed to or in addition to making the client wait to send
>> it). I'm reasonably relaxed about which one we do, but I think
>> we should do one or the other (or at least say that if the
>> client sends NBD_CMD_DISC without waiting for commands to complete
>> then those commands must not be executed). There are thus
>> various choices for NBD_CMD_DISC.
> 
> I think it is perfectly fair to put the requirement on the server that
> it MUST wait until all inflight commands have been responded to before
> disconnecting;

I suspect we'll end up with that.

> and at the same time that a client SHOULD wait until
> there are no inflight commands before sending NBD_CMD_DISC.

I like that. I think Wouter doesn't.

>> I think the option haggling phase is different (or rather need
>> not be the same). Fundamentally options MUST be processed in
>> the order they are issued,
> 
> No, we already explicitly state that options may be replied to
> out-of-order, and that the burden is on the client to wait for
> particular replies before sending another option of the same type.

Yeah so Wouter pointed out :-(

>> and there is only ever one in
>> flight at a time.
> 
> No, a client can batch send a bunch of options before waiting for any
> replies.

Yeah so Wouter pointed out :-(

>>> 
>>> It might be good 

Re: [Nbd] [PATCH v2] doc: Add new NBD_REP_INFO reply, for advertising block size

2016-04-13 Thread Alex Bligh

On 13 Apr 2016, at 16:26, Eric Blake  wrote:

>> Having thought a bit more about this, I think we might (after all)
>> need a client flag which says "I respect minimum block sizes"
>> or "I respect block sizes" very early on in the negotiation.
>> 
>> The reason why is this.
>> 
>> Let's suppose I have a file backed NBD server. I'd really like
>> to open my files with O_DIRECT in order to gain performance, but
>> to do so I need to (a) advertise a minimum block size of 4096,
>> and (b) (crucially) know the client will respect that. If
>> my client doesn't tell me that, I'd open without O_DIRECT.
>> 
>> Thoughts?
> 
> Is it plausible that block sizes may differ per-export, or is it more
> likely to always be a global property of the server?  In other words,
> should we have a dedicated NBD_OPT_BLOCK_SIZE issued by the client which
> returns the sizes used globally by the server, rather than having to
> advertise (possibly-separate) sizing per export?

It's not only possible, it's certain. My server (for instance)
supports a Ceph backend and a file backend, and they have different
minimum block sizes. In the future it will also support O_DIRECT
file, and that will have another block size.

I suspect nbd-server.c might be like that with one of the weird cow
or treefile settings that I've never understood :-)

However, I think it's reasonable to say a *client* either understands
the concept of minimum block sizes, or does not (which is what I
was getting at).
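
To make the O_DIRECT motivation concrete, here is a minimal sketch of
the decision a file-backed server might take once it knows whether
the client has promised to respect block sizes. The function name
`open_flags` and the 4096 threshold are assumptions taken from the
example quoted above, not from any existing server.

```python
import os


def open_flags(min_block_size, client_honors_block_sizes):
    """Pick open(2) flags for the backing file (illustrative only)."""
    flags = os.O_RDWR
    # O_DIRECT is not defined on every platform, so fall back to 0.
    o_direct = getattr(os, "O_DIRECT", 0)
    # Only use O_DIRECT when the client has promised aligned requests
    # and the advertised minimum is at least 4096 bytes.
    if client_honors_block_sizes and min_block_size >= 4096:
        flags |= o_direct
    return flags
```

Because block sizes differ per export, a server would presumably make
this choice per export, while the client's promise is made once.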

--
Alex Bligh








Re: [Nbd] [PATCH v2] doc: Add new NBD_REP_INFO reply, for advertising block size

2016-04-13 Thread Alex Bligh

On 13 Apr 2016, at 16:24, Eric Blake  wrote:

>>> 
>>> +* `NBD_INFO_MAXIMUM_BLOCK` (4)
>>> +
>>> +  The *info length* MUST be 4, and represents the maximum block
>>> +  size.  See the "Block sizes" section for further requirements on
>>> +  its value.
>>> +
>>> +  - 32 bits, maximum block size
>> 
>> I like these.
> 
> Is it worth keeping the three separate (where the server can advertise
> one, but not all three, and the others fall back to defaults), or is it
> easier to just always require all three to be present (since we've
> documented sane defaults, a server should always be able to supply
> something)?

I think having them all in one option is easier. I guess we could
permit '0' to mean 'the default' but TBH I'd prefer if you specified
one of them you specify all of them, which would also avoid the need
for the rather convoluted language I tried to suggest about the
need to ensure that if you only specify some of them those ones
need to be compatible with the default.
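
A sketch of what "all three in one option" could look like on the
wire. The layout here (three consecutive 32-bit unsigned integers) is
hypothetical and is not taken from the patch; only the big-endian
byte order and the checks on the minimum and preferred sizes (power
of 2, minimum no larger than 65,536) follow the protocol and the
draft "Block sizes" text. The helper names are invented.

```python
import struct


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def pack_block_sizes(minimum, preferred, maximum):
    """Pack minimum, preferred and maximum block size into 12 bytes."""
    if not is_power_of_two(minimum) or minimum > 65536:
        raise ValueError("minimum must be a power of 2 no larger than 65536")
    if not is_power_of_two(preferred):
        raise ValueError("preferred must be a power of 2")
    # NBD sends integers in network byte order (big-endian).
    return struct.pack(">III", minimum, preferred, maximum)


def unpack_block_sizes(payload):
    """Inverse of pack_block_sizes (hypothetical layout)."""
    return struct.unpack(">III", payload)
```

With a single fixed-size payload there is no need to describe how a
partially specified set of sizes interacts with the defaults.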

--
Alex Bligh








Re: [Nbd] [PATCH v2] doc: Add new NBD_REP_INFO reply, for advertising block size

2016-04-13 Thread Alex Bligh

On 13 Apr 2016, at 15:51, Eric Blake  wrote:

> On 04/13/2016 05:41 AM, Alex Bligh wrote:
>> 
>> On 13 Apr 2016, at 12:05, Alex Bligh  wrote:
>> 
>>>> Having a default for preferred block size sounds sane, although it might
>>>> be better to switch it to 4096 (which is what most conversations seem to
>>>> use today) rather than 512.
>>> 
>>> +1
>> 
>> Actually doubly +1, as this in theory allows for O_DIRECT type optimisation
>> with 4k page sizes.
> 
> Telling clients to use a default preferred size of 64k (rather than 4k)
> is still nice for O_DIRECT optimizations.

Sure. Default preferred size a *minimum* of 4k. I think Wouter is
saying it shouldn't be 512.

--
Alex Bligh








Re: [Nbd] [PATCH v2] doc: Add new NBD_REP_INFO reply, for advertising block size

2016-04-13 Thread Eric Blake
On 04/13/2016 05:41 AM, Alex Bligh wrote:
> 
> On 13 Apr 2016, at 12:05, Alex Bligh  wrote:
> 
>>> Having a default for preferred block size sounds sane, although it might
>>> be better to switch it to 4096 (which is what most conversations seem to
>>> use today) rather than 512.
>>
>> +1
> 
> Actually doubly +1, as this in theory allows for O_DIRECT type optimisation
> with 4k page sizes.

Telling clients to use a default preferred size of 64k (rather than 4k)
is still nice for O_DIRECT optimizations.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Nbd] [PATCH v2] doc: Add new NBD_REP_INFO reply, for advertising block size

2016-04-13 Thread Eric Blake
On 04/13/2016 01:27 AM, Wouter Verhelst wrote:
> Eric,
> 
> On Tue, Apr 12, 2016 at 06:16:54PM -0600, Eric Blake wrote:
>> Existing NBD servers often have limitations, such as requiring
>> actions to be aligned to block sizes or limiting maximum
>> transactions to avoid denial of service attacks; for example,
>> qemu's NBD server refuses any transaction larger than 32M.  But
>> to date, clients have to learn these limitations via out-of-band
>> means.
>>
>> This alters NBD_OPT_INFO and NBD_OPT_GO to use a new reply type
>> NBD_REP_INFO (rather than overloading NBD_REP_SERVER), so that
>> we have a future-proof way of supplying as much additional
>> structured information about an export as we want.
>>
>> Design decision: a client that wants to learn block sizes MUST
>> use NBD_OPT_GO, rather than the old NBD_OPT_EXPORT_NAME, even
>> though we could have repurposed some of the reserved zeroes when
>> NBD_FLAG_C_NO_ZEROES is not in effect, because we don't want to
>> encourage any further abuse of NBD_OPT_EXPORT_NAME.
>>
>> Design decision: no new global NBD_FLAG or NBD_OPT are required;
>> there is nothing for the client to negotiate.  The server merely
>> provides as much information as it can, and the client then
>> interprets what information it understands.  The items are
>> structured so that a client can ignore details from the server
>> that the client does not know about, and that we can easily add
>> future items of information.
> 
> General note (in-depth review may follow later):
> 
> Currently, there are no default minimum or maximum block sizes, and
> therefore they are effectively limited to "1 byte" for the minimum block
> size, and "the size of the device" for the maximum block size.

We have existing servers that don't comply with this; I'm trying to
approach it from the angle of:

- if the server advertises, then that is the limit
- if the client and server communicate out of band, then use those limits
- if the server does not advertise, then the server MUST support a
minimum block size of 1 (although not all existing servers do, they can
be considered buggy), and the client SHOULD NOT send requests smaller
than 512 unless the export size proves a smaller block size is in use
(that is, if export size % 512 is non-zero, we know the server supports
a smaller block size in order to access those tail bytes), but the
client MAY still attempt to send smaller alignments and hope that the
server is not buggy
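
The three cases above could be sketched from the client's side as
follows. The function name and the precedence (an in-band
advertisement wins over an out-of-band agreement) are assumptions for
illustration; the 512-byte fallback and the export-size test come from
the list above.

```python
def effective_min_block(advertised=None, out_of_band=None, export_size=None):
    """Smallest request size a cautious client would send (sketch)."""
    if advertised is not None:
        return advertised
    if out_of_band is not None:
        return out_of_band
    # Nothing advertised: a non-multiple-of-512 export size shows the
    # server must support smaller accesses to reach the tail bytes.
    if export_size is not None and export_size % 512 != 0:
        return 1
    return 512
```

A client MAY still try smaller requests than this function returns;
the sketch only captures the conservative choice.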

> 
> I do agree that it might be advantageous for the server to announce such
> minimum and maximum sizes, but I don't think that defining defaults that
> differ from what historically has been the effective default is the
> right way to go.
> 

But that's my argument - historically, there ARE servers which insist on
512 or even 4096-byte alignment, and which misbehave (close the
connection, return an error, or possibly even corrupt data) rather than
do read-modify-write on misaligned requests.  And qemu's server
certainly kills connections on attempts to exceed 32M in a single
transaction.  So while the theoretical server should be unlimited unless
advertised, the practical client should stick within sane limits rather
than attempting to exploit unlimited.
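
As a rough sketch of "sticking within sane limits", a client could split a large transfer into requests no longer than the advertised maximum, falling back to 32M when nothing is advertised. The names and the choice of 32M as the fallback are assumptions for illustration (32M is simply the qemu limit mentioned above), not anything the spec currently mandates.

```c
#include <stdint.h>

/* Assumed fallback cap, taken from the qemu behaviour described above. */
#define ASSUMED_MAX_REQUEST (32u * 1024u * 1024u)

/* Length of the next request when `remaining` bytes are still to be
 * transferred. advertised_max == 0 means the server advertised nothing. */
uint32_t next_request_len(uint64_t remaining, uint32_t advertised_max)
{
    uint32_t cap = advertised_max ? advertised_max : ASSUMED_MAX_REQUEST;
    return remaining < cap ? (uint32_t)remaining : cap;
}

/* Number of requests needed to cover `total` bytes under that cap. */
uint64_t request_count(uint64_t total, uint32_t advertised_max)
{
    uint64_t n = 0;
    while (total > 0) {
        total -= next_request_len(total, advertised_max);
        n++;
    }
    return n;
}
```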

I'm open to suggestions on wording (such as using SHOULD rather than
MUST when a value is not advertised, so that existing implementations
are merely poorer quality of implementation rather than in violation of
the spec).

> Therefore, I would like this to say that unless you announce
> differently, the maximum block size is the size of the device, and the
> minimum block size is 1 byte.
> 
> Having a default for preferred block size sounds sane, although it might
> be better to switch it to 4096 (which is what most conversations seem to
> use today) rather than 512.

Except that I suggested 64k, not 4096, as the default preferred block
size.  Even though 4096 may be atomic, it is not as efficient; and in at
least qemu's case, the default qcow2 file sizing uses 64k clusters as
its preferred I/O size.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Nbd] [Qemu-devel] [PATCH v2] doc: Add NBD_CMD_BLOCK_STATUS extension

2016-04-13 Thread Eric Blake
On 04/13/2016 06:38 AM, Pavel Borzenkov wrote:
>> I'm also starting to think that it is worth FIRST documenting an
>> extension for advertising block sizes, so that we can then couch
>> BLOCK_STATUS in those terms (a server MUST NOT subdivide status into
>> finer granularity than the advertised block sizes).
> 
> Why do you need to operate with blocks instead of list of extents?

It's still a list of extents, just that having a definition of block
sizes means that we can then require that the list of extents will not
be subdivided smaller than a particular block size (so the client
doesn't have to worry about a server overwhelming it with extents
covering 1 byte each).
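
To make the idea concrete, here is a minimal sketch of a client-side check that a reply's extents are never finer than the advertised block size. The struct layout and function name are hypothetical, and the sketch ignores the case where the final extent is cut short by the end of the export.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one extent descriptor. */
struct extent {
    uint32_t length;  /* bytes covered by this extent */
    uint32_t status;  /* status flags; not interpreted here */
};

/* True if every extent is a non-zero multiple of block_size. */
bool extents_respect_block_size(const struct extent *ext, size_t n,
                                uint32_t block_size)
{
    if (block_size == 0)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (ext[i].length == 0 || ext[i].length % block_size != 0)
            return false;
    }
    return true;
}
```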

> What benefits will this approach provide for a client or a server?
> 
> Are you still working on the spec? I can update the patch with
> information about server-side limit/beyond request's length replies and
> post v3, so that things keep moving forward.

You're welcome to post a v3, if you don't want to wait for me to get
around to it.  There's a lot going on in the spec right now, and I'm
hoping to help flush some of the pending patches first.


-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Nbd] [Qemu-devel] [PATCH v2] doc: Add NBD_CMD_BLOCK_STATUS extension

2016-04-13 Thread Pavel Borzenkov
Hi Eric,

On Thu, Apr 07, 2016 at 10:10:58AM -0600, Eric Blake wrote:
> On 04/07/2016 04:38 AM, Vladimir Sementsov-Ogievskiy wrote:
> > On 05.04.2016 16:43, Paolo Bonzini wrote:
> >>
> >> On 05/04/2016 06:05, Kevin Wolf wrote:
> >>> The options I can think of are adding a request field "max number of
> >>> descriptors" or a flag "only single descriptor" (with the assumption
> >>> that clients always want one or unlimited), but maybe you have a better
> >>> idea.
> >> I think a limit is better.  Even if the client is ultimately going to
> >> process the whole file, it may take a very long time and space to
> >> retrieve all the descriptors in one go.  Rather than query e.g. 16GB at
> >> a time, I think it's simpler to put a limit of 1024 descriptors or so.
> >>
> >> Paolo
> >>
> > 
> > I vote for the limit too. Moreover, I think there should be a limit on
> > both sides:
> > 
> > 1. The client can specify the limit, so the server should not return more
> > extents than requested. Of course, the server should choose sequential
> > extents from the beginning of the requested range.
> 
> For the client to request a limit would entail that we enhance the
> protocol to allow structured requests (where a wire-sniffer would know
> how many bytes to read for the client's additional data, even if it does
> not understand the extension's semantics).  Might not be a bad idea to
> have this in the long run, but so far I've been reluctant to bite the
> bullet.
> 
> > 2. Server-side limit: if the client asks for too many extents or does not
> > specify a limit at all, the server should not return all extents, but only
> > 1024 (for example) from the beginning of the range.
> 
> Okay, I'm fairly convinced now that letting the server limit the reply
> is a good thing, and that one doesn't require a structured request from
> the client.  Since we just recently documented that strings should be no
> more than 4096 bytes, and my v2 proposal used 8 bytes per descriptor,
> maybe a good way to enforce a similar limit would be:
> 
> The server MAY choose to send fewer descriptors than what would describe
> the full extent of the client's request, but MUST send at least one
> descriptor unless an error is reported.  The server MUST NOT send more
> than 512 descriptors, even if that does not completely describe the
> client's requested length.
> 
> That way, a client in general should never expect more than ~4096 bytes
> + overhead on any server reply except a reply to NBD_CMD_READ, and can
> therefore utilize stack allocation for all other replies (if we do this,
> maybe we should make a hard rule that all future protocol extensions,
> other than NBD_CMD_READ, will guarantee that a reply has a bounded size)
> 
> I also think it may be okay to let the server reply with MORE data than
> the client requested, but only as long as it does not result in any
> extra descriptors (that is, only the last descriptor can result in a
> length beyond the client's request).  For example, if the client asks
> for block status of 1M of the file, but the server can conveniently
> learn via lseek(SEEK_HOLE) or other means that there are 2M of data
> before status changes, then there's no reason to force the server to
> throw away the information about the 1M beyond the client's read, and
> the client might even be able to be more efficient in later requests.
> 
> > 2.1 And/or, why not allow the server to use the power of structured reply
> > and send several reply chunks? Why did you forbid this? (if I correctly
> > understand "This chunk type MUST appear at most once in a structured
> > reply.")
> 
> If we allow more than one chunk, then either every chunk has to include
> an offset (more traffic over the wire), or the chunks have to be sent in
> a particular order (we aren't gaining any benefits that NBD_CMD_READ
> gains by allowing out-of-order transmission).  It's also more work for
> the client to reconstruct if it has to reassemble; with NBD_CMD_READ,
> the payload is dominated by the data being read, and you can pwrite()
> the data into its final location as the client; but with
> NBD_CMD_BLOCK_STATUS, the payload is dominated by the metadata and we
> want to keep it minimal; and there is no convenient command for the
> client to reassemble the information if received out of order.
> 
> Allowing for a short reply seems to be worth doing, but allowing for
> multiple reply chunks seems not worth the risk.
> 
> I'm also starting to think that it is worth FIRST documenting an
> extension for advertising block sizes, so that we can then couch
> BLOCK_STATUS in those terms (a server MUST NOT subdivide status into
> finer granularity than the advertised block sizes).

Why do you need to operate with blocks instead of list of extents?
What benefits will this approach provide for a client or a server?

Are you still working on the spec? I can update the patch with
information about server-side limit/beyond request's length replies and
post v3, so that things keep moving forward.
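
For reference, the server-side limit and the "last descriptor may run past the request" rule from the quoted text could be sketched roughly as follows. The function name and the flat array of extent lengths are hypothetical, and 512 is only the figure floated in the quoted proposal, not a settled value.

```c
#include <stddef.h>
#include <stdint.h>

/* Cap floated in the quoted proposal (512 descriptors of 8 bytes = 4096). */
#define MAX_DESCRIPTORS 512

/* Given the lengths of the extents that start at the requested offset,
 * return how many leading descriptors the server would send for a request
 * of req_len bytes. Only the last descriptor may extend past req_len, and
 * the count never exceeds MAX_DESCRIPTORS. */
size_t descriptors_to_send(const uint32_t *extent_len, size_t n_extents,
                           uint64_t req_len)
{
    uint64_t covered = 0;
    size_t n = 0;

    while (n < n_extents && n < MAX_DESCRIPTORS && covered < req_len) {
        covered += extent_len[n];  /* may overshoot req_len on the last one */
        n++;
    }
    return n;
}
```

With a single 2M extent and a 1M request, this yields one descriptor that reports the full 2M, which is the lseek(SEEK_HOLE) case described in the quoted text.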

-- 

Re: [Nbd] [PATCH] Docs: improve description of disconnection methods

2016-04-13 Thread Alex Bligh
Wouter,

On 13 Apr 2016, at 12:44, Wouter Verhelst  wrote:

> Hi Alex,
> 
> On Wed, Apr 13, 2016 at 11:25:02AM +0100, Alex Bligh wrote:
>> Wouter,
>> 
>>>> +A client MAY use a soft disconnect to terminate the session
>>>> +whenever it wishes, provided that there are no outstanding
>>>> +replies to options.
>>> 
>>> NAK. A client MAY use a soft disconnect *at any time*, but the server
>>> MUST NOT act upon it until there are no outstanding replies, and the
>>> client MUST NOT send any further options after sending NBD_OPT_ABORT.
>>> 
>>> (same for CMD_DISC)
>> 
>> This gets to the root of the unresolved issues I think. I suspect
>> the answer may be different for NBD_OPT_ABORT and NBD_CMD_DISC.
>> 
>> NBD commands are asynchronous and can be replied and acted on
>> in any order (so, from the document):
>> 
>>  "The server MAY process commands out of order, and MAY reply
>>   out of order"
>> 
>> I wouldn't want to lose that. So if the client sends NBD_CMD_DISC
>> without waiting for all his inflight commands to complete, those
>> inflight commands may not be executed at all, because the server
>> is free to process commands in any order. It's going to make
>> server design very awkward if you can only process /some/ commands
>> out of order.
> 
> It's actually fairly easy. Current nbd-server does this
> (mainloop_threaded):
> 
>     if(req->type == NBD_CMD_DISC) {
>         g_thread_pool_free(tpool, FALSE, TRUE);
>         return 0;
>     }
> 
> The "return 0" causes it to exit mainloop_threaded, which causes the
> server to do its final cleanup and exit.
> 
> The second argument is "immediate", which is documented like so:
> 
>  If immediate is TRUE, no new task is processed for pool. Otherwise pool is
>  not freed before the last task is processed. Note however, that no thread of
>  this pool is interrupted while processing a task. Instead at least all still
>  running threads can finish their tasks before the pool is freed.
> 
> IOW (since we use FALSE), we finish whatever outstanding requests are
> queued, and then close the TCP connection.
> 
> That's all the handling that NBD_CMD_DISC (currently) requires.
> 
> What's awkward about that?

Well, firstly not everyone uses your threading mechanism.

Secondly I think you are doing exactly what I said below. You
are processing it immediately, but waiting for all commands
to complete - "no thread of this pool is interrupted while
processing a task". I can't recall whether in nbd-server
that actually means the replies are sent - if I remember it
has a mutex on the socket for that, rather than a separate
non-pooled reply sending thread (which is what I have).

> Note that NBD_CMD_DISC is the *only* command for which I think it makes
> sense to have ordering requirements. Everything else should be fair
> game. We could perhaps make that more explicit in the "Ordering of
> messages and writes" section?

I think that would be pretty foul. Especially as you actually
seem to do exactly what I suggest below!

>> Another alternative would be to make the server
>> wait for all commands to complete before acting on the disconnect
>> (as opposed to or in addition to making the client wait to send
>> it).

So isn't this what you are doing? Waiting (in g_thread_pool_free)
for the existing requests to finish?

That's far better than mucking around with the
statement on ordering.

>> I'm reasonably relaxed about which one we do, but I think
>> we should do one or the other (or at least say that if the
>> client sends NBD_CMD_DISC without waiting for commands to complete
>> then those commands must not be executed).
> 
> I think that is obviously wrong, and we should not say that.

I meant that if it was permitted behaviour, we should document
it. I agree that I'd like it not to be permitted behaviour!

>> There are thus various choices for NBD_CMD_DISC.
>> 
>> I think the option haggling phase is different (or rather need
>> not be the same). Fundamentally options MUST be processed in
>> the order they are issued, and there is only ever one in
>> flight at a time. My understanding is (though perhaps this is
>> not explicit in the document) that the client should not be
>> sending ANY option until it has got a reply to the last one.
> 
> Well, the text already says
> 
>  As there is no unique number for client requests, clients who want to
>  differentiate between answers to two instances of the same option
>  during any negotiation must make sure they've seen the answer to an
>  outstanding request before sending the next one of the same type. The
>  server MAY send replies in the order that the requests were received,
>  but is not required to.
> 
> So no, you're wrong here, too ;-)

Yuck! I'd much rather this was a serial process (or at least lock-step).
I don't know of any server that doesn't implement it that way.
Basically the current semantics say "well, you can send what you like
but unless you 

Re: [Nbd] [PATCH] Docs: improve description of disconnection methods

2016-04-13 Thread Wouter Verhelst
Hi Alex,

On Wed, Apr 13, 2016 at 11:25:02AM +0100, Alex Bligh wrote:
> Wouter,
> 
> >> +A client MAY use a soft disconnect to terminate the session
> >> +whenever it wishes, provided that there are no outstanding
> >> +replies to options.
> > 
> > NAK. A client MAY use a soft disconnect *at any time*, but the server
> > MUST NOT act upon it until there are no outstanding replies, and the
> > client MUST NOT send any further options after sending NBD_OPT_ABORT.
> > 
> > (same for CMD_DISC)
> 
> This gets to the root of the unresolved issues I think. I suspect
> the answer may be different for NBD_OPT_ABORT and NBD_CMD_DISC.
> 
> NBD commands are asynchronous and can be replied and acted on
> in any order (so, from the document):
>  
>   "The server MAY process commands out of order, and MAY reply
>    out of order"
> 
> I wouldn't want to lose that. So if the client sends NBD_CMD_DISC
> without waiting for all his inflight commands to complete, those
> inflight commands may not be executed at all, because the server
> is free to process commands in any order. It's going to make
> server design very awkward if you can only process /some/ commands
> out of order.

It's actually fairly easy. Current nbd-server does this
(mainloop_threaded):

    if(req->type == NBD_CMD_DISC) {
        /* immediate=FALSE, wait_=TRUE: let queued tasks finish, then free the pool */
        g_thread_pool_free(tpool, FALSE, TRUE);
        return 0;
    }

The "return 0" causes it to exit mainloop_threaded, which causes the
server to do its final cleanup and exit.

The second argument is "immediate", which is documented like so:

  If immediate is TRUE, no new task is processed for pool. Otherwise pool is
  not freed before the last task is processed. Note however, that no thread of
  this pool is interrupted while processing a task. Instead at least all still
  running threads can finish their tasks before the pool is freed.

IOW (since we use FALSE), we finish whatever outstanding requests are
queued, and then close the TCP connection.

That's all the handling that NBD_CMD_DISC (currently) requires.
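
For servers that do not use a GLib thread pool, the same "finish what is queued, then close" behaviour can be sketched with a simple in-flight counter. This is not how nbd-server does it; all names are hypothetical, and the sketch is single-threaded (a real server would need locking around the counter).

```c
#include <stdbool.h>

/* Hypothetical per-connection state. */
struct session {
    unsigned inflight;    /* requests received but not yet replied to */
    bool disc_requested;  /* NBD_CMD_DISC has been seen */
};

/* A new request arrived; refuse it if the client already sent a disconnect. */
bool on_request(struct session *s)
{
    if (s->disc_requested)
        return false;
    s->inflight++;
    return true;
}

/* NBD_CMD_DISC arrived: note it, but do not act on it yet. */
void on_disconnect(struct session *s)
{
    s->disc_requested = true;
}

/* A reply for one in-flight request has been written to the socket. */
void on_reply_sent(struct session *s)
{
    if (s->inflight > 0)
        s->inflight--;
}

/* The connection may be closed only once everything has drained. */
bool may_close(const struct session *s)
{
    return s->disc_requested && s->inflight == 0;
}
```

The point of the sketch is that the disconnect itself is accepted immediately, while acting on it is deferred until the counter reaches zero, so the other commands can still complete in any order.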

What's awkward about that?

Note that NBD_CMD_DISC is the *only* command for which I think it makes
sense to have ordering requirements. Everything else should be fair
game. We could perhaps make that more explicit in the "Ordering of
messages and writes" section?

> Another alternative would be to make the server
> wait for all commands to complete before acting on the disconnect
> (as opposed to or in addition to making the client wait to send
> it). I'm reasonably relaxed about which one we do, but I think
> we should do one or the other (or at least say that if the
> client sends NBD_CMD_DISC without waiting for commands to complete
> then those commands must not be executed).

I think that is obviously wrong, and we should not say that.

> There are thus various choices for NBD_CMD_DISC.
> 
> I think the option haggling phase is different (or rather need
> not be the same). Fundamentally options MUST be processed in
> the order they are issued, and there is only ever one in
> flight at a time. My understanding is (though perhaps this is
> not explicit in the document) that the client should not be
> sending ANY option until it has got a reply to the last one.

Well, the text already says

  As there is no unique number for client requests, clients who want to
  differentiate between answers to two instances of the same option
  during any negotiation must make sure they've seen the answer to an
  outstanding request before sending the next one of the same type. The
  server MAY send replies in the order that the requests were received,
  but is not required to.

So no, you're wrong here, too ;-)

Since the option reply contains the request type, too, clients can
differentiate between two requests of different type, but not two
requests of the same type.
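
A client that wants to pipeline options under this rule only has to track which option types are still awaiting a reply. Here is a minimal sketch, with hypothetical names and an arbitrary bound on option numbers chosen just for the example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary bound on option numbers, for this sketch only. */
#define MAX_OPT 64

struct opt_tracker {
    bool outstanding[MAX_OPT];  /* indexed by option type */
};

/* Returns true if an option of this type may be sent now, and marks it as
 * outstanding. A second request of the same type must wait for the reply. */
bool opt_try_send(struct opt_tracker *t, uint32_t opt)
{
    if (opt >= MAX_OPT || t->outstanding[opt])
        return false;
    t->outstanding[opt] = true;
    return true;
}

/* Called on the final reply; the reply carries the option type, which is
 * what lets the client match it to the request. */
bool opt_reply_received(struct opt_tracker *t, uint32_t opt)
{
    if (opt >= MAX_OPT || !t->outstanding[opt])
        return false;
    t->outstanding[opt] = false;
    return true;
}
```

Two options of different types can be in flight together, while a repeat of the same type is held back until its reply arrives.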

> Certainly I know of no servers which can process options in
> parallel, and as we don't have an 'Id' field to line up
> replies they would have to be processed sequentially anyway.
> I think we should document (somewhere) that the client MUST NOT
> send an option until it has received a final reply to the previous
> option. If we don't do that, we should document how concurrency
> with options is meant to work.

We already do :)

> So for this case I think it is completely correct that NBD_OPT_ABORT
> must not (like any other option) be sent until there are no
> outstanding *option* replies.
> 
> Let's see if we can resolve this one on list.

I think the current text is clear enough, and we don't need to resolve
anything (although clarifying some things might make sense).

> >> +terminate the session. In the client's case, if it wishes to
> >> +do so it MUST use soft disconnect. In the server's case it
> >> +MUST (save where set out above) simply error inbound options until
> >> +the client gets the hint that it is unwelcome.
> > 
> > It might be good to add a "NBD_REP_ERR_NOSERVICE" error, for "server
> > 

Re: [Nbd] [PATCH v2] doc: Add new NBD_REP_INFO reply, for advertising block size

2016-04-13 Thread Alex Bligh

On 13 Apr 2016, at 12:05, Alex Bligh  wrote:

>> Having a default for preferred block size sounds sane, although it might
>> be better to switch it to 4096 (which is what most conversations seem to
>> use today) rather than 512.
> 
> +1

Actually doubly +1, as this in theory allows for O_DIRECT type optimisation
with 4k page sizes.

-- 
Alex Bligh







Re: [Nbd] [PATCH v2] doc: Add new NBD_REP_INFO reply, for advertising block size

2016-04-13 Thread Alex Bligh

On 13 Apr 2016, at 08:27, Wouter Verhelst  wrote:

> Currently, there are no default minimum or maximum block sizes, and
> therefore they are effectively limited to "1 byte" for the minimum block
> size, and "the size of the device" for the maximum block size.
> 
> I do agree that it might be advantageous for the server to announce such
> minimum and maximum sizes, but I don't think that defining defaults that
> differ from what historically has been the effective default is the
> right way to go.
> 
> Therefore, I would like this to say that unless you announce
> differently, the maximum block size is the size of the device, and the
> minimum block size is 1 byte.
> 
> Having a default for preferred block size sounds sane, although it might
> be better to switch it to 4096 (which is what most conversations seem to
> use today) rather than 512.

+1

-- 
Alex Bligh







Re: [Nbd] [PATCH] Docs: improve description of disconnection methods

2016-04-13 Thread Alex Bligh
Eric,

Agree with the nits - many of them were from the mailing list
message which of course I then didn't check before copying
into the commit message.

Re the substance:

>> +* Transmission mode can be entered (by the client sending
>> +  `NBD_OPT_EXPORT_NAME` or by the server responding to an
>> +  `NBD_OPT_GO` with `NBD_REP_ACK`). This is documented
> 
> s/ACK/SERVER/

yep

> (although I may bite the bullet and create a new NBD_REP_INFO if we want
> the name to be optional, since NBD_OPT_[INFO/GO] is still experimental,
> as part of my rework on block size information)

indeed, but that's orthogonal.

>> +A client MAY use a soft disconnect to terminate the session
>> +whenever it wishes, provided that there are no outstanding
>> +replies to options.
> 
> Why the disclaimer on no outstanding replies?

See reply to Wouter who made the same point. Let's handle that
there.

>> +terminate the session. In the client's case, if it wishes to
>> +do so it MUST use soft disconnect. In the server's case it
>> +MUST (save where set out above) simply error inbound options until
>> +the client gets the hint that it is unwelcome.
> 
> so basically wait for either the client to give up and close first, or
> for the client to do something that is provably in violation of a MUST
> in the protocol so the server can close the connection.  Can a malicious
> client abuse this requirement to tie up a server as a denial of service?

Good point. I think we should give the server the right to disconnect
in a DoS situation. This is a bit like DoS protection for TCP violating
the TCP spec though.

>> +On a server shutdown, the server SHOULD wait for inflight
>> +requests to be serviced prior to initiating a hard disconnect.
> 
> Maybe a mention that the server MAY use error replies to speed up the
> processing of those requests, even if the command would normally succeed
> if termination weren't pending?

+1, and as Wouter suggested, use a different reply.

>> +The client MAY issue a soft disconnect at any time, but
>> +MUST wait until there are no inflight requests first.
> 
> Why MUST and not SHOULD?  Didn't Wouter have an example of a client that
> batches up its entire request sequence, including NBD_CMD_DISC, and
> sends that in bulk before waiting for any server replies?  I thought the
> goal was that the server MUST NOT react to NBD_CMD_DISC until all other
> pending requests have been dealt with, but don't necessarily see the
> reason why the client MUST NOT send NBD_CMD_DISC while requests are
> inflight.

The issue is that the server MAY process requests out of order. I thus
think such a client is foolhardy as the server MAY process the NBD_CMD_DISC
first. It depends whether that 'processing' includes 'waiting for all
the other commands'. Again, see my reply to Wouter - this is definitely
an area we need to sort out.

I agree though that MUST is too strong. I think perhaps I'd say
it may send it if it wishes, but should remember the server can process
requests out of order.

>> - `NBD_OPT_ABORT` (2)
>> 
>> -The client desires to abort the negotiation and close the
>> -connection.
>> +The client desires to abort the negotiation and terminate the
>> +session. The server MUST reply with `NBD_REP_ACK`.
> 
> Maybe explicitly mention that the client MAY disconnect immediately
> rather than waiting to receive the response?

I'm saying the client MUST wait. But if it doesn't (meaning only
clients will be non-conformant) nothing is lost, particularly as
this is only at option haggling stage. So it's more (as Wouter
said) "be aware that there may be old non-compliant clients that
will not wait".

--
Alex Bligh








Re: [Nbd] [PATCH] Docs: improve description of disconnection methods

2016-04-13 Thread Wouter Verhelst
On Tue, Apr 12, 2016 at 08:31:33PM +0100, Alex Bligh wrote:
> Improve the documentation as per the mailing list discussion.
> Here's what we decided (broadly).
> 
> * One side MAY drop the connection if the other end violates a
>  MUST condition.
> 
> * The server MUST drop the connection in the 'no way out' situations
>  during the negotiation phase (error on NBD_OPT_EXPORT_NAME, error
>  in negotiating text).
> 
> * The server SHOULD NOT otherwise drop the connection. It can wait
>  and error the next command. Clearly there are situations where
>  this is going to happen (e.g. server shutdown).
> 
> * If the server does need to drop the connection, it SHOULD wait
>  until there are no commands in-flight in transmission mode,
>  if possible.
> 
> * If the client is going to drop the connection, then other
>  than in the event of a protocol violation or a 'no way out'
>  situation (e.g. TLS negotiation fails), it MUST use NBD_CMD_DISC
>  or NBD_OPT_ABORT
> 
> * We should tidy up the semantics and descriptions of NBD_CMD_DISC
>  and NBD_OPT_ABORT, viz replies or not to the latter, shutting
>  down TLS properly etc.
> 
> Other changes:
> 
> * Added a reply to NBD_OPT_ABORT. No harm if the client closes
>   the connection anyway.
> 
> * Said the offset and length fields in NBD_CMD_DISC MUST be zero.
>   Not doing so is a protocol violation and would only lead to ...
>   the connection being closed, so this is a useful tidy up.
> 
> * Introduced consistent terminology for disconnection throughout.
> 
> This patch applies on top of:
>   0001-docs-proto.md-Clarify-SHOULD-MUST-MAY-etc v7
> 
> Signed-off-by: Alex Bligh 
> ---
>  doc/proto.md | 143 
> ---
>  1 file changed, 107 insertions(+), 36 deletions(-)
> 
> diff --git a/doc/proto.md b/doc/proto.md
> index b88e054..db6b96d 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -122,7 +122,7 @@ C: 32 bits, flags
>  This completes the initial phase of negotiation; the client and server
>  now both know they understand the first version of the newstyle
>  handshake, with no options. The client SHOULD ignore any handshake flags
> -it does not recognize, while the server MUST close the connection if
> +it does not recognize, while the server MUST close the TCP connection if
>  it does not recognize the client's flags.  What follows is a repeating
>  group of options. In non-fixed newstyle only one option can be set
>  (`NBD_OPT_EXPORT_NAME`), and it is not optional.
> @@ -150,8 +150,8 @@ S: 16 bits, transmission flags
>  S: 124 bytes, zeroes (reserved) (unless `NBD_FLAG_C_NO_ZEROES` was
> negotiated by the client)  
>  
> -If the server is unwilling to allow the export, it SHOULD close the
> -connection.
> +If the server is unwilling to allow the export, it MUST terminate
> +the session.
>  
>  The reason that the flags field is 16 bits large and not 32 as in the
>  oldstyle negotiation is that there are now 16 bits of transmission flags,
> @@ -201,22 +201,60 @@ request before sending the next one of the same type. The server MAY
>  send replies in the order that the requests were received, but is not
>  required to.
>  
> + Termination of the session during option haggling
> +
> +There are three possible mechanisms to end option haggling:
> +
> +* Transmission mode can be entered (by the client sending
> +  `NBD_OPT_EXPORT_NAME` or by the server responding to an
> +  `NBD_OPT_GO` with `NBD_REP_ACK`). This is documented
> +  elsewhere.
> +
> +* The client can send (and the server can reply to) an
> +  `NBD_OPT_ABORT`. This MUST be followed by the client
> +  shutting down TLS (if it is running), and the client
> +  dropping the connection. This is referred to as
> +  'initiating a soft disconnect'; soft disconnects can
> +  only be initiated by the client.
> +
> +* The client or the server can disconnect the TCP session
> +  without activity at the NBD protocol level. If TLS is
> +  negotiated, the party initiating the transaction SHOULD
> +  shutdown TLS first if it is running. This is referred
> +  to as 'initiating a hard disconnect'.
> +
> +This section concerns the second and third of these, together
> +called 'terminating the session', and under which circumstances
> +they are valid.
> +
> +If either the client or the server detects a violation of a
> +mandatory condition ('MUST' etc.) by the other party, it MAY
> +initiate a hard disconnect.
> +
> +A client MAY use a soft disconnect to terminate the session
> +whenever it wishes, provided that there are no outstanding
> +replies to options.

NAK. A client MAY use a soft disconnect *at any time*, but the server
MUST NOT act upon it until there are no outstanding replies, and the
client MUST NOT send any further options after sending NBD_OPT_ABORT.

(same for CMD_DISC)

[...]
> +terminate the session. In the client's case, if it wishes to
> +do so it MUST use soft disconnect. In the server's case it
> +MUST (save where