Hi Fiona, Ahmed

>-----Original Message-----
>From: Ahmed Mansour [mailto:ahmed.mans...@nxp.com]
>Sent: 16 February 2018 02:40
>To: Trahe, Fiona <fiona.tr...@intel.com>; Verma, Shally 
><shally.ve...@cavium.com>; dev@dpdk.org
>Cc: Athreya, Narayana Prasad <narayanaprasad.athr...@cavium.com>; Gupta, 
>Ashish <ashish.gu...@cavium.com>; Sahu, Sunila
><sunila.s...@cavium.com>; De Lara Guarch, Pablo 
><pablo.de.lara.gua...@intel.com>; Challa, Mahipal
><mahipal.cha...@cavium.com>; Jain, Deepak K <deepak.k.j...@intel.com>; Hemant 
>Agrawal <hemant.agra...@nxp.com>; Roy
>Pledge <roy.ple...@nxp.com>; Youri Querry <youri.querr...@nxp.com>
>Subject: Re: [RFC v2] doc compression API for DPDK
>
>On 2/15/2018 1:47 PM, Trahe, Fiona wrote:
>> Hi Shally, Ahmed,
>> Sorry for the delay in replying,
>> Comments below
>>
>>> -----Original Message-----
>>> From: Verma, Shally [mailto:shally.ve...@cavium.com]
>>> Sent: Wednesday, February 14, 2018 7:41 AM
>>> To: Ahmed Mansour <ahmed.mans...@nxp.com>; Trahe, Fiona 
>>> <fiona.tr...@intel.com>;
>>> dev@dpdk.org
>>> Cc: Athreya, Narayana Prasad <narayanaprasad.athr...@cavium.com>; Gupta, 
>>> Ashish
>>> <ashish.gu...@cavium.com>; Sahu, Sunila <sunila.s...@cavium.com>; De Lara 
>>> Guarch, Pablo
>>> <pablo.de.lara.gua...@intel.com>; Challa, Mahipal 
>>> <mahipal.cha...@cavium.com>; Jain, Deepak K
>>> <deepak.k.j...@intel.com>; Hemant Agrawal <hemant.agra...@nxp.com>; Roy 
>>> Pledge
>>> <roy.ple...@nxp.com>; Youri Querry <youri.querr...@nxp.com>
>>> Subject: RE: [RFC v2] doc compression API for DPDK
>>>
>>> Hi Ahmed,
>>>
>>>> -----Original Message-----
>>>> From: Ahmed Mansour [mailto:ahmed.mans...@nxp.com]
>>>> Sent: 02 February 2018 01:53
>>>> To: Trahe, Fiona <fiona.tr...@intel.com>; Verma, Shally 
>>>> <shally.ve...@cavium.com>; dev@dpdk.org
>>>> Cc: Athreya, Narayana Prasad <narayanaprasad.athr...@cavium.com>; Gupta, 
>>>> Ashish
>>> <ashish.gu...@cavium.com>; Sahu, Sunila
>>>> <sunila.s...@cavium.com>; De Lara Guarch, Pablo 
>>>> <pablo.de.lara.gua...@intel.com>; Challa,
>>> Mahipal
>>>> <mahipal.cha...@cavium.com>; Jain, Deepak K <deepak.k.j...@intel.com>; 
>>>> Hemant Agrawal
>>> <hemant.agra...@nxp.com>; Roy
>>>> Pledge <roy.ple...@nxp.com>; Youri Querry <youri.querr...@nxp.com>
>>>> Subject: Re: [RFC v2] doc compression API for DPDK
>>>>
>>>> On 1/31/2018 2:03 PM, Trahe, Fiona wrote:
>>>>> Hi Ahmed, Shally,
>>>>>
>>>>> ///snip///
>>>>>>>>>>> D.1.1 Stateless and OUT_OF_SPACE
>>>>>>>>>>> ------------------------------------------------
>>>>>>>>>>> OUT_OF_SPACE is the condition where the output buffer runs out of
>>>>>>>>>>> space while the PMD still has more data to produce. If the PMD runs
>>>>>>>>>>> into this condition, it is an error condition in stateless
>>>>>>>>>>> processing.
>>>>>>>>>>> In such a case, the PMD resets itself and returns with status
>>>>>>>>>>> RTE_COMP_OP_STATUS_OUT_OF_SPACE with produced=consumed=0,
>>>>>>>>>>> i.e. no input read, no output written.
>>>>>>>>>>> The application can resubmit the full input with a larger output
>>>>>>>>>>> buffer.
>>>>>>>>>> [Ahmed] Can we add an option to allow the user to read the data that
>>>>>>>>>> was produced while still reporting OUT_OF_SPACE? This is mainly
>>>>>>>>>> useful for decompression applications doing search.
>>>>>>>>> [Shally] It is there, but applicable to the stateful operation type
>>>>>>>>> (please refer to the handling of out_of_space under the "Stateful
>>>>>>>>> Section").
>>>>>>>>> By definition, "stateless" here means that the application (such as
>>>>>>>>> IPCOMP) knows the maximum output size in advance and ensures that the
>>>>>>>>> uncompressed data cannot grow beyond the provided output buffer.
>>>>>>>>> Such apps can submit an op with type = STATELESS and provide the full
>>>>>>>>> input; the PMD then assumes it has sufficient input and output and
>>>>>>>>> thus doesn't need to maintain any context after the op is processed.
>>>>>>>>> If the application doesn't know the max output size, it should
>>>>>>>>> process the data as a stateful op, i.e. set up the op with type =
>>>>>>>>> STATEFUL and attach a stream so that the PMD can maintain the
>>>>>>>>> relevant context to handle such a condition.
>>>>>>>> [Fiona] There may be an alternative that's useful for Ahmed, while
>>>>>>>> still respecting the stateless concept.
>>>>>>>> In the stateless decompression case where a PMD reports OUT_OF_SPACE,
>>>>>>>> it could also return consumed=0, produced=x, where x>0. x indicates
>>>>>>>> the amount of valid data which has been written to the output buffer.
>>>>>>>> It is not complete, but if an application wants to search it, it may
>>>>>>>> be sufficient.
>>>>>>>> If the application still wants the data, it must resubmit the whole
>>>>>>>> input with a bigger output buffer, and decompression will be repeated
>>>>>>>> from the start; it cannot expect to continue on, as the PMD has not
>>>>>>>> maintained state, history or data.
>>>>>>>> I don't think there would be any need to indicate this in
>>>>>>>> capabilities; PMDs which cannot provide this functionality would
>>>>>>>> always return produced=consumed=0, while PMDs which can could set
>>>>>>>> produced > 0.
>>>>>>>> If this works for you both, we could consider a similar case for
>>>>>>>> compression.
>>>>>>>>
>>>>>>> [Shally] Sounds fine to me. Though in that case, consumed should also
>>>>>>> be updated to the amount actually consumed by the PMD.
>>>>>>> Setting consumed = 0 with produced > 0 doesn't correlate.
>>>>>> [Ahmed] I like Fiona's suggestion, but I also do not like the
>>>>>> implication of returning consumed = 0. At the same time, returning
>>>>>> consumed = y implies to the user that it can proceed from the middle. I
>>>>>> prefer the consumed = 0 implementation, but I think a different return
>>>>>> code is needed to distinguish it from the OUT_OF_SPACE that the user can
>>>>>> recover from. Perhaps OUT_OF_SPACE_RECOVERABLE and
>>>>>> OUT_OF_SPACE_TERMINATED. This also allows future PMD implementations to
>>>>>> provide recoverability even in STATELESS mode if they so wish. In this
>>>>>> model STATELESS or STATEFUL would be a hint for the PMD implementation
>>>>>> to make optimizations for each case, but it does not force the PMD
>>>>>> implementation to limit functionality if it can provide recoverability.
>>>>> [Fiona] So you're suggesting the following:
>>>>> OUT_OF_SPACE - returned only on stateful operation. Not an error.
>>>>>     Op.produced can be used and the next op in the stream should
>>>>>     continue on from op.consumed+1.
>>>>> OUT_OF_SPACE_TERMINATED - returned only on stateless operation.
>>>>>     Error condition, no recovery possible.
>>>>>     consumed=produced=0. Application must resubmit all input data with
>>>>>     a bigger output buffer.
>>>>> OUT_OF_SPACE_RECOVERABLE - returned only on stateless operation, some
>>>>> recovery possible.
>>>>>     - consumed = 0, produced > 0. Application must resubmit all input
>>>>>       data with a bigger output buffer. However, in the decompression
>>>>>       case, data up to produced in the dst buffer may be
>>>>>       inspected/searched. Never happens in the compression case, as the
>>>>>       output data would be meaningless.
>>>>>     - consumed > 0, produced > 0. PMD has stored relevant state and
>>>>>       history and so can convert to stateful, using op.produced and
>>>>>       continuing from consumed+1.
>>>>> I don't expect our PMDs to use this last case, but maybe this works for
>>>>> others? I'm not convinced it's not just adding complexity. It sounds
>>>>> like a version of stateful without a stream, and maybe less efficient?
>>>>> If so, should it respect the FLUSH flag? Which would have been FULL or
>>>>> FINAL in the op. Or treat it as FLUSH_NONE or SYNC? I don't know why an
>>>>> application would not simply have submitted a STATEFUL request if this
>>>>> is the behaviour it wants?
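[Editor's sketch] For illustration, a minimal C sketch of how an application might dispatch on the status codes discussed above. The enum, struct and names here are simplified stand-ins reflecting this thread's proposal, not the existing rte_compressdev API:

```c
#include <stddef.h>
#include <assert.h>

/* Status codes as proposed in this thread (OUT_OF_SPACE split in two);
 * these are assumptions, not current rte_comp_op_status values. */
enum comp_status {
    COMP_STATUS_SUCCESS,
    COMP_STATUS_OUT_OF_SPACE_TERMINATED,  /* stateless: no recovery */
    COMP_STATUS_OUT_OF_SPACE_RECOVERABLE, /* stateful: resume from consumed+1 */
};

/* Minimal stand-in for the rte_comp_op fields discussed above. */
struct comp_op {
    enum comp_status status;
    size_t consumed; /* bytes of src the PMD read */
    size_t produced; /* bytes of valid data written to dst */
};

enum caller_action { RESUBMIT_ALL, CONTINUE_FROM_CONSUMED, DONE };

/* The status alone decides what "consumed" means: TERMINATED makes it
 * meaningless (resubmit everything with a bigger dst buffer, though in
 * decompression any produced > 0 bytes may still be inspected), while
 * RECOVERABLE makes it meaningful (continue from consumed+1). */
enum caller_action next_action(const struct comp_op *op)
{
    switch (op->status) {
    case COMP_STATUS_OUT_OF_SPACE_TERMINATED:
        return RESUBMIT_ALL;
    case COMP_STATUS_OUT_OF_SPACE_RECOVERABLE:
        return CONTINUE_FROM_CONSUMED;
    default:
        return DONE;
    }
}
```

This makes the key point of the proposal concrete: the ambiguity of a single OUT_OF_SPACE code disappears because the caller never has to guess whether consumed is trustworthy.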
>>>> [Ahmed] I was actually suggesting the removal of OUT_OF_SPACE entirely
>>>> and replacing it with:
>>>> OUT_OF_SPACE_TERMINATED - returned only on stateless operation.
>>>>     Error condition, no recovery possible.
>>>>     - consumed=0, produced=amount of data produced. Application must
>>>>       resubmit all input data with a bigger output buffer to process all
>>>>       of the op.
>>>> OUT_OF_SPACE_RECOVERABLE - normally returned on stateful operation. Not
>>>> an error. Op.produced can be used and the next op in the stream should
>>>> continue on from op.consumed+1.
>>>>     - consumed > 0, produced > 0. PMD has stored relevant state and
>>>>       history and so can continue using op.produced and continuing from
>>>>       consumed+1.
>>>>
>>>> We would not return OUT_OF_SPACE_RECOVERABLE in stateless mode in our
>>>> implementation either.
>>>>
>>>> Regardless of speculative future PMDs, the more important aspect of this
>>>> for today is that the return status clearly determines the meaning of
>>>> "consumed". If it is RECOVERABLE then consumed is meaningful; if it is
>>>> TERMINATED then consumed is meaningless. This way we take away the
>>>> ambiguity of having OUT_OF_SPACE mean two different user work flows.
>>>>
>>>> A speculative future PMD may be designed to return RECOVERABLE for
>>>> stateless ops that are attached to streams. A future PMD may look to see
>>>> if an op has a stream attached, write out the state there, and go into
>>>> recoverable mode. In essence this leaves the choice up to the
>>>> implementation and allows the PMD to take advantage of stateless
>>>> optimizations so long as a "RECOVERABLE" scenario is rarely hit. The PMD
>>>> will dump context as soon as it fully processes an op. It will only write
>>>> context out in cases where the op chokes.
>>>> This futuristic PMD should ignore the FLUSH flag, since STATELESS mode
>>>> was indicated by the user, and optimize accordingly.
>>> [Shally] IMO, it looks okay to have two separate return codes, TERMINATED
>>> and RECOVERABLE, with the definitions as you mentioned, and it seems
>>> doable.
>>> So then it means all the following conditions:
>>> a. stateless with flush = full/final, no stream pointer provided: PMD can
>>> return TERMINATED, i.e. user has to start all over again, it's a failure
>>> (as in current definition)
>>> b. stateless with flush = full/final, stream pointer provided: here it's
>>> up to the PMD to return either TERMINATED or RECOVERABLE depending upon
>>> its ability (note: if recoverable, then the PMD will maintain state in
>>> the stream pointer)
>>> c. stateful with flush = full / NO_SYNC, stream pointer always there: PMD
>>> will return TERMINATED/RECOVERABLE depending on whether the
>>> STATEFUL_COMPRESSION/DECOMPRESSION feature flag is enabled or not
>> [Fiona] I don't think the flush flag is relevant - it could run out of
>> space with any flush flag, and if out of space it should ignore the flush
>> flag.
>> Is there a need for TERMINATED? I didn't think it would ever need to be
>> returned in the stateful case.
>> Why the ref to the feature flag? If a PMD doesn't support a feature I
>> think it should fail the op - not with out-of-space, but with unsupported
>> or similar. Or it would fail on stream creation.
>[Ahmed] Agreed with Fiona. The flush flag only matters on success. By
>definition the PMD should return OUT_OF_SPACE_RECOVERABLE in stateful
>mode when it runs out of space.
>@Shally If the user did not provide a stream, then the PMD should
>probably return TERMINATED every time. I am not sure we should make a
>"really smart" PMD which returns RECOVERABLE even if no stream pointer
>was given. In that case the PMD must give some ID back to the caller
>that the caller can use to "recover" the op. I am not sure how it would
>be implemented in the PMD, or when the PMD would decide to retire streams
>belonging to dead ops that the caller decided not to "recover".
>>
>>> and one more exception case is:
>>> d. stateless with flush = full, no stream pointer provided: PMD can
>>> return RECOVERABLE, i.e. the PMD internally maintained that state somehow
>>> and consumed & produced > 0, so the user can start from consumed+1, but
>>> there's a restriction that the user not alter or change the op until it
>>> is fully processed?!
>> [Fiona] Why the need for this case?
>> There's always a restriction on the user not to alter or change the op
>> until it is fully processed.
>> If a PMD can do this, why doesn't it create a stream when that API is
>> called? Then it's the same as b.
>[Ahmed] Agreed. The user should not touch an op once enqueued until they
>receive it in dequeue. We ignore the flush in stateless mode. We assume
>it to be final every time.

[Shally] Agreed, and I am not in favour of supporting such an implementation 
either. I just listed out the different possibilities up here to better 
visualise Ahmed's requirements and the applicability of TERMINATED and 
RECOVERABLE.

>>
>>> The API currently takes care of cases a and c, and case b can be
>>> supported if the specification accepts another proposal which mentions
>>> optional usage of a stream with stateless.
>> [Fiona] The API has this, but as we agreed, it is not optional to call
>> create_stream() with an op_type parameter (stateful/stateless). The PMD
>> can return NULL or provide a stream; if the latter, then that stream must
>> be attached to ops.
>>
>>> Until then, the API makes no difference between cases b and c, i.e. we
>>> can have an op such as:
>>> - type = stateful with flush = full/final, stream pointer provided, PMD
>>> can return TERMINATED/RECOVERABLE according to its ability.
>>>
>>> Case d is something exceptional; if there's a requirement in PMDs to
>>> support it, then I believe it will be doable with the concept of a
>>> different return code.
>>>
>> [Fiona] That's not quite how I understood it. Can it be simpler, with
>> only the following cases?
>> a. stateless with flush = full/final, no stream pointer provided: PMD can
>>    return TERMINATED, i.e. user has to start all over again, it's a
>>    failure (as in current definition).
>>    consumed = 0, produced = amount of data produced. This is usually 0,
>>    but in the decompression case a PMD may return > 0 and the application
>>    may find it useful to inspect that data.
>> b. stateless with flush = full/final, stream pointer provided: here it's
>>    up to the PMD to return either TERMINATED or RECOVERABLE depending
>>    upon its ability (note: if recoverable, then the PMD will maintain
>>    state in the stream pointer)
>> c. stateful with flush = any, stream pointer always there: PMD will
>>    return RECOVERABLE.
>>    op.produced can be used and the next op in the stream should continue
>>    on from op.consumed+1.
>>    consumed=0, produced=0 is an unusual but allowed case. I'm not sure if
>>    it could ever happen, but there is no need to change the status to
>>    TERMINATED in this case. There may be useful state/history stored in
>>    the PMD, even though no output has been produced yet.
>[Ahmed] Agreed
[Shally] Sounds good.

>>
>>>>>>>>>>> D.2 Compression API Stateful operation
>>>>>>>>>>> ----------------------------------------------------------
>>>>>>>>>>> A stateful operation in DPDK compression means the application
>>>>>>>>>>> invokes enqueue_burst() multiple times to process related chunks
>>>>>>>>>>> of data, either because
>>>>>>>>>>> - the application broke the data into several ops, and/or
>>>>>>>>>>> - the PMD ran into an out_of_space situation during input
>>>>>>>>>>>   processing
>>>>>>>>>>>
>>>>>>>>>>> In case of either one or both of the above conditions, the PMD is
>>>>>>>>>>> required to maintain the state of the op across enqueue_burst()
>>>>>>>>>>> calls; ops are set up with op_type RTE_COMP_OP_STATEFUL, begin
>>>>>>>>>>> with flush value = RTE_COMP_NO/SYNC_FLUSH and end at flush value
>>>>>>>>>>> RTE_COMP_FULL/FINAL_FLUSH.
>>>>>>>>>>> D.2.1 Stateful operation state maintenance
>>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>>> It is always the ideal expectation that the application should
>>>>>>>>>>> gather all related chunks of source data into an mbuf chain and
>>>>>>>>>>> enqueue it for stateless processing.
>>>>>>>>>>> However, if it needs to break the data into several
>>>>>>>>>>> enqueue_burst() calls, then an expected call flow would be
>>>>>>>>>>> something like:
>>>>>>>>>>> enqueue_burst( |op.no_flush |)
>>>>>>>>>> [Ahmed] The work is now in flight to the PMD. The user will call
>>>>>>>>>> dequeue_burst in a loop until all ops are received. Is this
>>>>>>>>>> correct?
>>>>>>>>>>
>>>>>>>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>>>>>> [Shally] Yes. Ideally every submitted op needs to be dequeued.
>>>>>>>>> However, this illustration is specifically in the context of
>>>>>>>>> stateful op processing, to reflect that if a stream is broken into
>>>>>>>>> chunks, then each chunk should be submitted as one op at a time
>>>>>>>>> with type = STATEFUL, and must be dequeued before the next chunk is
>>>>>>>>> enqueued.
>>>>>>>>>
>>>>>>>>>>> enqueue_burst( |op.no_flush |)
>>>>>>>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>>>>>>>> enqueue_burst( |op.full_flush |)
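[Editor's sketch] The chunked stateful flow illustrated above (one op in flight per stream, dequeue before the next enqueue, FULL flush only on the last chunk) can be sketched in C. The helper names and the 1:1 "compression" are illustrative stand-ins, not the real rte_compressdev API:

```c
#include <stddef.h>
#include <assert.h>

/* Illustrative flush values mirroring RTE_COMP_NO_FLUSH / RTE_COMP_FULL_FLUSH. */
enum flush { FLUSH_NONE, FLUSH_FULL };

/* Toy stand-in for an enqueue_burst()/dequeue_burst() round trip of a single
 * stateful op: this fake "PMD" just reports the chunk length as produced. */
static size_t submit_and_wait(const char *src, size_t len, enum flush f)
{
    (void)src;
    (void)f;
    return len; /* pretend 1:1 "compression" for the sketch */
}

/* Each chunk is one op; every op is dequeued before the next chunk is
 * enqueued, and only the last op carries the FULL flush, as described above. */
size_t process_stream(const char *chunks[], const size_t lens[], size_t n)
{
    size_t produced = 0;
    for (size_t i = 0; i < n; i++) {
        enum flush f = (i == n - 1) ? FLUSH_FULL : FLUSH_NONE;
        produced += submit_and_wait(chunks[i], lens[i], f);
    }
    return produced;
}
```

The serialization is the point: because each op depends on the state left by the previous one, the loop never has two ops of the same stream in flight.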
>>>>>>>>>> [Ahmed] Why not allow multiple work items in flight? I understand
>>>>>>>>>> that occasionally there will be an OUT_OF_SPACE exception. Can we
>>>>>>>>>> just distinguish the response in exception cases?
>>>>>>>>> [Shally] Multiple ops are allowed in flight; however, the condition
>>>>>>>>> is that each op in such a case is independent of the others, i.e.
>>>>>>>>> they belong to different streams altogether.
>>>>>>>>> Earlier (as part of the RFC v1 doc) we did consider the proposal to
>>>>>>>>> process all related chunks of data in a single burst by passing
>>>>>>>>> them as an ops array, but later found that not so useful for PMD
>>>>>>>>> handling for various reasons. You may please refer to the RFC v1
>>>>>>>>> doc review comments for the same.
>>>>>>>> [Fiona] Agree with Shally. In summary, as only one op can be
>>>>>>>> processed at a time, since each needs the state of the previous, to
>>>>>>>> allow more than one op to be in flight at a time would force PMDs to
>>>>>>>> implement internal queueing and exception handling for the
>>>>>>>> OUT_OF_SPACE conditions you mention.
>>>>>> [Ahmed] But we are putting the ops on qps which would make them
>>>>>> sequential. Handling OUT_OF_SPACE conditions would be a little bit more
>>>>>> complex but doable.
>>>>> [Fiona] In my opinion this is not doable, and could be very
>>>>> inefficient. There may be many streams.
>>>>> The PMD would have to have an internal queue per stream so it could
>>>>> adjust the next src offset and length in the OUT_OF_SPACE case.
>>>>> And this may ripple back through all subsequent ops in the stream as
>>>>> each source len is increased and its dst buffer is not big enough.
>>>> [Ahmed] Regarding multi-op OUT_OF_SPACE handling:
>>>> The caller would still need to adjust the src length/output buffer as
>>>> you say. The PMD cannot handle OUT_OF_SPACE internally.
>>>> After OUT_OF_SPACE occurs, the PMD should reject all ops in this stream
>>>> until it gets explicit confirmation from the caller to continue working
>>>> on this stream. Any ops received by the PMD should be returned to the
>>>> caller with status STREAM_PAUSED, since the caller did not explicitly
>>>> acknowledge that it has solved the OUT_OF_SPACE issue.
>>>> These semantics can be enabled by adding a new function to the API,
>>>> perhaps stream_resume().
>>>> This allows the caller to indicate that it acknowledges that it has
>>>> seen the issue, and this op should be used to resolve the issue.
>>>> Implementations that do not support this mode of use can push back
>>>> immediately after one op is in flight. Implementations that support
>>>> this use mode can allow many ops from the same session.
>>>>
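[Editor's sketch] Ahmed's proposed pause/resume semantics amount to a small per-stream state machine. STREAM_PAUSED, stream_resume() and all other names below are hypothetical, taken from the proposal in this thread rather than from the current API:

```c
#include <assert.h>

/* Hypothetical stream states for the pause/resume proposal. */
enum stream_state { STREAM_OK, STREAM_RECOVERABLE };

struct stream {
    enum stream_state state;
};

/* Hypothetical per-op results from the proposal above. */
enum op_result { OP_PROCESSED, OP_OUT_OF_SPACE_RECOVERABLE, OP_STREAM_PAUSED };

/* PMD-side gate: the first op that runs out of space flips the stream into
 * RECOVERABLE, and every later op on that stream bounces back with
 * STREAM_PAUSED until the caller acknowledges via stream_resume(). */
enum op_result pmd_submit(struct stream *s, int runs_out_of_space)
{
    if (s->state == STREAM_RECOVERABLE)
        return OP_STREAM_PAUSED;
    if (runs_out_of_space) {
        s->state = STREAM_RECOVERABLE;
        return OP_OUT_OF_SPACE_RECOVERABLE;
    }
    return OP_PROCESSED;
}

/* Caller acknowledges it has seen the issue and fixed the buffers. */
void stream_resume(struct stream *s)
{
    s->state = STREAM_OK;
}
```

The design choice under discussion is exactly where this gate lives: in the PMD (as sketched), in a software layer above it, or nowhere at all with one-op-in-flight enforced by convention.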
>>> [Shally] Is this still in the context of having a single burst where all
>>> ops belong to one stream? If yes, I would still say it would add
>>> overhead to PMDs, especially if they are expected to work close to HW
>>> (which I think is the case with DPDK PMDs).
>>> Though your approach is doable, why can't all this be in a layer above
>>> the PMD? i.e. a layer above the PMD can either pass one op at a time
>>> with burst size = 1, OR make chained mbufs of input and output and pass
>>> that as one op.
>>> Is it just to ease the application's chained-mbuf burden, or do you see
>>> a performance/use-case impact as well?
>>>
>>> If it is in a context where each op in a burst belongs to a different
>>> stream, then why do we need stream_pause and resume? It is the
>>> expectation that the app passes a bigger output buffer with consumed + 1
>>> from the next call onwards, as it has already seen OUT_OF_SPACE.
>[Ahmed] Yes, this would add extra overhead to the PMD. Our PMD
>implementation rejects all ops that belong to a stream that has entered
>"RECOVERABLE" state for one reason or another. The caller must
>acknowledge explicitly that it has received news of the problem before
>the PMD allows this stream to exit "RECOVERABLE" state. I agree with you
>that implementing this functionality in the software layer above the PMD
>is a bad idea since the latency reductions are lost.

[Shally] Just reiterating, I rather meant the other way around, i.e. I see it 
as easier to put all such complexity in a layer above the PMD.

>This setup is useful in latency sensitive applications where the latency
>of buffering multiple ops into one op is significant. We found latency
>makes a significant difference in search applications where the PMD
>competes with software decompression.
>> [Fiona] I still have concerns with this and would not want to support it
>> in our PMD.
>> To make sure I understand: you want to send a burst of ops, with several
>> from the same stream.
>> If one causes OUT_OF_SPACE_RECOVERABLE, then the PMD should not process
>> any subsequent ops in that stream.
>> Should it return them in a dequeue_burst() with status still
>> NOT_PROCESSED? Or somehow drop them? How?
>> While still processing ops from other streams.
>[Ahmed] This is exactly correct. It should return them with
>NOT_PROCESSED. Yes, the PMD should continue processing other streams.
>> As we want to offload each op to hardware with as little CPU processing
>> as possible, we would not want to open up each op to see which stream
>> it's attached to and make decisions to do per-stream storage, or drop it,
>> or bypass hw and dequeue without processing.
>[Ahmed] I think I might have missed your point here, but I will try to
>answer. There is no need to "cushion" ops in DPDK. DPDK should send ops
>to the PMD and the PMD should reject until stream_continue() is called.
>The next op to be sent by the user will have a special marker in it to
>inform the PMD to continue working on this stream. Alternatively the
>DPDK layer can be made "smarter" to fail during the enqueue by checking
>the stream and its state, but like you say this adds additional CPU
>overhead during the enqueue.
>I am curious. In a simple synchronous use case. How do we prevent users
>from putting multiple ops in flight that belong to a single stream? Do
>we just currently say it is undefined behavior? Otherwise we would have
>to check the stream and incur the CPU overhead.
>>
>> Maybe we could add a capability if this behaviour is important for you?
>> e.g. ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS ?
>> Our PMD would set this to 0. And expect no more than one op from a stateful 
>> stream
>> to be in flight at any time.
>[Ahmed] That makes sense. This way the different DPDK implementations do
>not have to add extra checking for unsupported cases.

[Shally] @ahmed, if I summarise your use-case, is this how you want the PMD to 
behave?
- a burst carries *only one stream*, and all ops are then assumed to belong to 
that stream? (please note, here the burst is not carrying more than one stream)
- PMD will submit one op at a time to HW?
- if processed successfully, push it back to the completion queue with status = 
SUCCESS. If it failed or ran into OUT_OF_SPACE, then push it to the completion 
queue with status = FAILURE/OUT_OF_SPACE_RECOVERABLE and the rest with status = 
NOT_PROCESSED, and return with enqueue count = total # of ops submitted 
originally with the burst?
- app assumes all have been enqueued, so it goes and dequeues all ops
- on seeing an op with OUT_OF_SPACE_RECOVERABLE, the app resubmits a burst of 
ops with a call to the stream_continue/resume API, starting from the op which 
encountered OUT_OF_SPACE and the others that were NOT_PROCESSED, with updated 
input and output buffers?
- repeat until *all* are dequeued with status = SUCCESS, or *any* with status = 
FAILURE? If at any time a failure is seen, does the app then start the whole 
processing all over again, or just drop this burst?!

If all of the above is true, then I think we should add another API such as 
rte_comp_enque_single_stream() which would be functional under feature flag = 
ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS, or a better name is 
SUPPORT_ENQUEUE_SINGLE_STREAM?!
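[Editor's sketch] The capability-flag idea discussed here could look roughly like the helper below. The flag name, bit position and function are placeholders invented for this sketch, not part of any existing API:

```c
#include <stddef.h>
#include <assert.h>

/* Hypothetical feature flag from this thread; name and bit are placeholders. */
#define FF_ENQUEUE_MULTIPLE_STATEFUL_OPS (1u << 0)

struct comp_dev {
    unsigned int feature_flags;
};

/* How many ops of one stateful stream may the caller put in flight at once?
 * A PMD that does not advertise the capability expects no more than one, as
 * Fiona describes; a PMD that does may accept a whole same-stream burst. */
size_t max_in_flight_per_stream(const struct comp_dev *dev, size_t burst_size)
{
    if (dev->feature_flags & FF_ENQUEUE_MULTIPLE_STATEFUL_OPS)
        return burst_size;
    return 1;
}
```

An application would query the flag once at setup and shape its bursts accordingly, so PMDs that set the flag to 0 never see more than one in-flight op per stateful stream.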


>>
>>
>>>> Regarding the ordering of ops
>>>> We do force serialization of ops belonging to a stream in STATEFUL
>>>> operation. Related ops do
>>>> not go out of order and are given to available PMDs one at a time.
>>>>
>>>>>> The question is: is this mode of use useful for real-life
>>>>>> applications, or would we just be adding complexity? The technical
>>>>>> advantage of this is that processing of stateful ops is
>>>>>> interdependent, and PMDs can take advantage of caching and other
>>>>>> optimizations to make processing related ops much faster than
>>>>>> switching on every op. PMDs have to maintain more than 32 KB of state
>>>>>> for DEFLATE for every stream.
>>>>>>>> If the application has all the data, it can put it into chained
>>>>>>>> mbufs in a single op rather than multiple ops, which avoids pushing
>>>>>>>> all that complexity down to the PMDs.
>>>>>> [Ahmed] I think that your suggested scheme of putting all related
>>>>>> mbufs into one op may be the best solution without the extra
>>>>>> complexity of handling OUT_OF_SPACE cases, while still allowing the
>>>>>> enqueuer extra time, if we have a way of marking mbufs as ready for
>>>>>> consumption. The enqueuer may not have all the data at hand but can
>>>>>> enqueue the op with a couple of empty mbufs marked as not ready for
>>>>>> consumption. The enqueuer will then update the rest of the mbufs to
>>>>>> ready for consumption once the data is added. This introduces a race
>>>>>> condition. A second flag for each mbuf can be updated by the PMD to
>>>>>> indicate whether it processed it or not. This way, in cases where the
>>>>>> PMD beat the application to the op, the application will just update
>>>>>> the op to point to the first unprocessed mbuf and resend it to the
>>>>>> PMD.
>>>>> [Fiona] This doesn't sound safe. You want to add data to a stream
>>>>> after you've enqueued the op. You would have to write to op.src.length
>>>>> at a time when the PMD might be reading it. Sounds like a lock would be
>>>>> necessary.
>>>>> Once the op has been enqueued, my understanding is that its ownership
>>>>> is handed over to the PMD, and the application should not touch it
>>>>> until it has been dequeued. I don't think it's a good idea to change
>>>>> this model.
>>>>> Can't the application just collect a stream of data in chained mbufs
>>>>> until it has enough to send an op, then construct the op and, while
>>>>> waiting for that op to complete, accumulate the next batch of chained
>>>>> mbufs? Only construct the next op after the previous one is complete,
>>>>> based on the result of the previous one.
>>>>>
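[Editor's sketch] Fiona's accumulate-then-submit pattern can be modelled with a trivial accumulator: collect data while the previous op is in flight, and only build the next op once enough has been gathered. The byte threshold as the trigger is an assumption made for this sketch:

```c
#include <stddef.h>
#include <assert.h>

/* Tiny model of the pattern: ops are only constructed from fully collected
 * batches, so the PMD never sees a half-filled chain. */
struct accumulator {
    size_t pending;  /* bytes collected but not yet submitted */
    size_t ops_sent; /* ops constructed so far */
};

/* Append one chunk; once at least 'threshold' bytes have been gathered,
 * submit them as a single op (just counted here) and start accumulating
 * the next batch. */
void add_chunk(struct accumulator *a, size_t len, size_t threshold)
{
    a->pending += len;
    if (a->pending >= threshold) {
        a->ops_sent++; /* one op carries the whole accumulated chain */
        a->pending = 0;
    }
}
```

Because the op is built only after the batch is complete, no field of an enqueued op is ever written while the PMD may be reading it, which is the safety property Fiona is after.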
>>>> [Ahmed] Fair enough. I agree with you. I imagined it in a different
>>>> way, in which each mbuf would have its own length.
>>>> The advantage to gain is in applications where there is one PMD user:
>>>> the down time between ops can be significant, and setting up a single
>>>> producer-consumer pair significantly reduces the CPU cycles and PMD
>>>> down time.
>>>>
>>>> ////snip////
>
