Re: [RFC PATCH] lib: scatterlist: add sg splitting function

2015-07-30 Thread Russell King - ARM Linux
On Thu, Jul 30, 2015 at 09:02:15PM +0200, Robert Jarzmik wrote:
 Sometimes a scatter-gather has to be split into several chunks, or sub scatter
 lists. This happens for example if a scatter list will be handled by multiple
 DMA channels, each one filling a part of it.
 
 A concrete example comes with the media V4L2 API, where the scatter list is
 allocated from userspace to hold an image, regardless of the knowledge of how
 many DMAs will fill it :
  - in a simple RGB565 case, one DMA will pump data from the camera ISP to 
 memory
  - in the trickier YUV422 case, 3 DMAs will pump data from the camera ISP 
 pipes,
one for pipe Y, one for pipe U and one for pipe V
 
 For these cases, it is necessary to split the original scatter list into
 multiple scatter lists, which is the purpose of this patch.
 
 The guarantees that are required for this patch are :
  - the intersection of spans of any couple of resulting scatter lists is empty
  - the union of spans of all resulting scatter lists is a subrange of the span
of the original scatter list
  - if streaming DMA API operations (mapping, unmapping) should not happen both
on both the resulting and the original scatter list. It's either the first 
 or
the later ones.

Hmm.

What happens if...

n = dma_map_sg(dev, sg, nents, dir);

where n  nents (which can happen if you have an IOMMU and it coalesces
the entries)?

This also means that sg_dma_len(sg) may not be equal to sg-length, nor
may sg_dma_address(sg) correspond with sg_phys() etc.

 + for_each_sg(in, sg, sg_nents(in), i) {
 + if (skip  sg_dma_len(sg)) {
 + skip -= sg_dma_len(sg);

sg_dma_len() is undefined before the scatterlist is mapped.  Above, you
say that dma map should not happen on both the initial or the subsequently
split scatterlists, but this requires the original to be DMA-mapped.

Also, as I mention above, the number of scatterlist entries to process
is given by 'n' above, not 'nents'.  I'm not sure that there's any
requirement for dma_map_sg() to mark the new end of the scatterlist as
that'd result in information loss when subsequently unmapping.

 + for (i = 0, split = splitters; i  nb_splits; i++, split++) {
 + in_sg = split-in_sg0;
 + out_sg = split-out_sg;
 + out[i] = out_sg;
 + for (j = 0; j  split-nents; j++, out_sg++) {
 + *out_sg = *in_sg;
 + if (!j) {
 + out_sg-offset = split-skip_sg0;
 + sg_dma_len(out_sg) -= split-skip_sg0;
 + } else {
 + out_sg-offset = 0;
 + }
 + in_sg = sg_next(in_sg);
 + }
 + sg_dma_len(--out_sg) = split-len_last_sg;

Hmm, I'm not sure this is good enough.  If we're talking about a mapped
scatterlist, this won't touch the value returned by sg_dma_address() at
all, which will end up being incorrect.

If this is the state of the code in the media subsystem, then it's very
buggy, and in need of fixing.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] lib: scatterlist: add sg splitting function

2015-07-30 Thread Robert Jarzmik
Russell King - ARM Linux li...@arm.linux.org.uk writes:

 On Thu, Jul 30, 2015 at 09:02:15PM +0200, Robert Jarzmik wrote:
 Hmm.

 What happens if...

   n = dma_map_sg(dev, sg, nents, dir);

 where n  nents (which can happen if you have an IOMMU and it coalesces
 the entries)?
That's something I have not thought of.

My first guess is that if the input scatter list is mapped and entries are
coallesced, then :
 - there are sg entries where the sg_dma_length(sg) is the result of
   coallescing, ie. the sum of sg-length, sg_next(sg)-length, ...
 - the entries sg_next(sg)-length and next coallesced have sg_dma_length() of
   0, or are transformed to sg_chain entries, I don't really know.

I must check how this code resists to these cases, and cross-check in
drivers/iommu if my understanding of coallescing is correct.

 This also means that sg_dma_len(sg) may not be equal to sg-length, nor
 may sg_dma_address(sg) correspond with sg_phys() etc.
Yes, I understand that I think, the length being a consequence of coallescing,
the address being a consequence of a dma bus master offsetting physical 
addresses.

 +for_each_sg(in, sg, sg_nents(in), i) {
 +if (skip  sg_dma_len(sg)) {
 +skip -= sg_dma_len(sg);

 sg_dma_len() is undefined before the scatterlist is mapped.  Above, you
 say that dma map should not happen on both the initial or the subsequently
 split scatterlists, but this requires the original to be DMA-mapped.
Ah that means that the mapping has to be done on the original list the way the
current code is designed. That's actually how this code is exercised in the next
version of pxa_camera.c I'm working on.

It's a bit a pity, I thought I had managed a generic splitter to both mapped and
unmapped scatterlists ... I was obviously wrong.

 Also, as I mention above, the number of scatterlist entries to process
 is given by 'n' above, not 'nents'.
Ah yes. So if I add the requirement that the input scatterlist is mapped, I must
also add a parameter to sg_split() with the number of entries.

 I'm not sure that there's any requirement for dma_map_sg() to mark the new end
 of the scatterlist as that'd result in information loss when subsequently
 unmapping.
If it appears that the only viable way is having the input scatterlist mapped,
this sentence is not applicable, right ?

 +for (i = 0, split = splitters; i  nb_splits; i++, split++) {
 +in_sg = split-in_sg0;
 +out_sg = split-out_sg;
 +out[i] = out_sg;
 +for (j = 0; j  split-nents; j++, out_sg++) {
 +*out_sg = *in_sg;
 +if (!j) {
 +out_sg-offset = split-skip_sg0;
 +sg_dma_len(out_sg) -= split-skip_sg0;
 +} else {
 +out_sg-offset = 0;
 +}
 +in_sg = sg_next(in_sg);
 +}
 +sg_dma_len(--out_sg) = split-len_last_sg;

 Hmm, I'm not sure this is good enough.  If we're talking about a mapped
 scatterlist, this won't touch the value returned by sg_dma_address() at
 all, which will end up being incorrect.
Indeed. That means that in all of my tests, I had probably skip_sg0 == 0,
because otherwise the result would be incorrect.

 If this is the state of the code in the media subsystem, then it's very
 buggy, and in need of fixing.

This is the state of the code I submitted to a single media driver,
pxa_camera. It's a consequence of the pxa shift to dmaengine. And I thought it
would be worth exposing it to more users ... or not, that will depend on this
RFC.

And in the specific case of this pxa driver, it kinda works by luck so far. I
felt it when submitting it here.

Thanks for the review Russell, I have enough to think of and good inputs to
submit an RFCv2 in a couple of days. Even if this patch is dropped in the end,
because it's either too restrictive to force to map the input scatterlist, or
because it won't be that usefull, I will have a cleaner code in pxa_camera :)

Cheers.

-- 
Robert
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH] lib: scatterlist: add sg splitting function

2015-07-30 Thread Robert Jarzmik
Sometimes a scatter-gather has to be split into several chunks, or sub scatter
lists. This happens for example if a scatter list will be handled by multiple
DMA channels, each one filling a part of it.

A concrete example comes with the media V4L2 API, where the scatter list is
allocated from userspace to hold an image, regardless of the knowledge of how
many DMAs will fill it :
 - in a simple RGB565 case, one DMA will pump data from the camera ISP to memory
 - in the trickier YUV422 case, 3 DMAs will pump data from the camera ISP pipes,
   one for pipe Y, one for pipe U and one for pipe V

For these cases, it is necessary to split the original scatter list into
multiple scatter lists, which is the purpose of this patch.

The guarantees that are required for this patch are :
 - the intersection of spans of any couple of resulting scatter lists is empty
 - the union of spans of all resulting scatter lists is a subrange of the span
   of the original scatter list
 - if streaming DMA API operations (mapping, unmapping) should not happen both
   on both the resulting and the original scatter list. It's either the first or
   the later ones.
 - the caller is reponsible to call kfree() on the resulting scatterlists

Signed-off-by: Robert Jarzmik robert.jarz...@free.fr

---
The sg_split() function is an attempt to address the splitting in a generic
way. If it is judged unsuitable, it will remain as a specialized function in the
depths of a media driver.

Memo of people to ask:
To: Russell King - ARM Linux li...@arm.linux.org.uk
To: Jens Axboe ax...@kernel.dk
To: Guennadi Liakhovetski g.liakhovet...@gmx.de
To: Andrew Morton a...@linux-foundation.org
To: Mauro Carvalho Chehab mche...@osg.samsung.com
Cc: linux-media@vger.kernel.org
---
 include/linux/scatterlist.h |   3 ++
 lib/scatterlist.c   | 122 
 2 files changed, 125 insertions(+)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 9b1ef0c820a7..cee8648a6918 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -251,6 +251,9 @@ struct scatterlist *sg_next(struct scatterlist *);
 struct scatterlist *sg_last(struct scatterlist *s, unsigned int);
 void sg_init_table(struct scatterlist *, unsigned int);
 void sg_init_one(struct scatterlist *, const void *, unsigned int);
+int sg_split(struct scatterlist *in, const off_t skip, const int nb_splits,
+const size_t *split_sizes, struct scatterlist **out,
+gfp_t gfp_mask);
 
 typedef struct scatterlist *(sg_alloc_fn)(unsigned int, gfp_t);
 typedef void (sg_free_fn)(struct scatterlist *, unsigned int);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index d105a9f56878..c4415a2af0ec 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -759,3 +759,125 @@ size_t sg_pcopy_to_buffer(struct scatterlist *sgl, 
unsigned int nents,
return sg_copy_buffer(sgl, nents, buf, buflen, skip, true);
 }
 EXPORT_SYMBOL(sg_pcopy_to_buffer);
+
+struct sg_splitter {
+   struct scatterlist *in_sg0;
+   int nents;
+   off_t skip_sg0;
+   size_t len_last_sg;
+   struct scatterlist *out_sg;
+};
+
+static struct sg_splitter *sg_calculate_split(struct scatterlist *in,
+ off_t skip, const size_t *sizes,
+ int nb_splits, gfp_t gfp_mask)
+{
+   int i;
+   size_t size = sizes[0], len;
+   struct sg_splitter *splitters, *curr;
+   struct scatterlist *sg;
+
+   splitters = kcalloc(nb_splits, sizeof(*splitters), gfp_mask);
+   if (!splitters)
+   return NULL;
+
+   curr = splitters;
+   for_each_sg(in, sg, sg_nents(in), i) {
+   if (skip  sg_dma_len(sg)) {
+   skip -= sg_dma_len(sg);
+   continue;
+   }
+   len = min_t(size_t, size, sg_dma_len(sg) - skip);
+   if (!curr-in_sg0) {
+   curr-in_sg0 = sg;
+   curr-skip_sg0 = sg_dma_len(sg) - len;
+   }
+   size -= len;
+   curr-nents++;
+   if (!size) {
+   curr-len_last_sg = len;
+   size = *(++sizes);
+
+   if (!--nb_splits)
+   break;
+
+   curr++;
+   if (len  sg_dma_len(sg) - skip) {
+   curr-in_sg0 = sg;
+   curr-skip_sg0 = sg_dma_len(sg) - skip - len;
+   curr-nents++;
+   }
+   }
+   }
+
+   return splitters;
+}
+
+/**
+ * sg_split - split a scatterlist into several scatterlists
+ * @in: the input sg list
+ * @skip: the number of bytes to skip in the input sg list
+ * @nb_splits: the number of desired sg outputs
+ * @split_sizes: the respective size of each output sg list in bytes
+ * @out: an