Hello community,

here is the log from the commit of package python-dask for openSUSE:Factory checked in at 2018-10-11 11:58:21

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-dask (Old)
 and      /work/SRC/openSUSE:Factory/.python-dask.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-dask" Thu Oct 11 11:58:21 2018 rev:10 rq:640983 version:0.19.4 Changes: -------- --- /work/SRC/openSUSE:Factory/python-dask/python-dask.changes 2018-10-09 15:53:29.190331138 +0200 +++ /work/SRC/openSUSE:Factory/.python-dask.new/python-dask.changes 2018-10-11 11:58:23.913798749 +0200 @@ -1,0 +2,23 @@ +Wed Oct 10 01:49:52 UTC 2018 - Arun Persaud <[email protected]> + +- update to version 0.19.4: + * Array + + Implement apply_gufunc(..., axes=..., keepdims=...) (:pr:`3985`) + Markus Gonser + * Bag + + Fix typo in datasets.make_people (:pr:`4069`) Matthew Rocklin + * Dataframe + + Added percentiles options for dask.dataframe.describe method + (:pr:`4067`) Zhenqing Li + + Add DataFrame.partitions accessor similar to Array.blocks + (:pr:`4066`) Matthew Rocklin + * Core + + Pass get functions and Clients through scheduler keyword + (:pr:`4062`) Matthew Rocklin + * Documentation + + Fix Typo on hpc example. (missing = in kwarg). (:pr:`4068`) + Matthias Bussonier + + Extensive copy-editing: (:pr:`4065`), (:pr:`4064`), (:pr:`4063`) + Miguel Farrajota + +------------------------------------------------------------------- Old: ---- dask-0.19.3.tar.gz New: ---- dask-0.19.4.tar.gz ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ python-dask.spec ++++++ --- /var/tmp/diff_new_pack.Vpd5bP/_old 2018-10-11 11:58:24.925797463 +0200 +++ /var/tmp/diff_new_pack.Vpd5bP/_new 2018-10-11 11:58:24.945797438 +0200 @@ -22,7 +22,7 @@ # python(2/3)-distributed has a dependency loop with python(2/3)-dask %bcond_with test_distributed Name: python-dask -Version: 0.19.3 +Version: 0.19.4 Release: 0 Summary: Minimal task scheduling abstraction License: BSD-3-Clause ++++++ dask-0.19.3.tar.gz -> dask-0.19.4.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/PKG-INFO new/dask-0.19.4/PKG-INFO --- old/dask-0.19.3/PKG-INFO 2018-10-05 20:57:35.000000000 +0200 +++ new/dask-0.19.4/PKG-INFO 2018-10-09 21:27:57.000000000 +0200 @@ -1,12 +1,11 @@ -Metadata-Version: 1.2 +Metadata-Version: 2.1 Name: dask -Version: 0.19.3 +Version: 0.19.4 Summary: Parallel PyData with Task Scheduling Home-page: http://github.com/dask/dask/ -Author: Matthew Rocklin -Author-email: [email protected] +Maintainer: Matthew Rocklin +Maintainer-email: [email protected] License: BSD -Description-Content-Type: UNKNOWN Description: Dask ==== @@ -45,3 +44,9 @@ Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.* +Provides-Extra: dataframe +Provides-Extra: array +Provides-Extra: bag +Provides-Extra: distributed +Provides-Extra: delayed +Provides-Extra: complete diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/_version.py new/dask-0.19.4/dask/_version.py --- old/dask-0.19.3/dask/_version.py 2018-10-05 20:57:35.000000000 +0200 +++ new/dask-0.19.4/dask/_version.py 2018-10-09 21:27:57.000000000 +0200 @@ -11,8 +11,8 @@ { "dirty": false, "error": null, - "full-revisionid": "2e98e50a9055cab1a5d04d777f4e59702318a0ca", - "version": "0.19.3" + "full-revisionid": "bbae2d8a03b5b018e019f9fd2b90004fe6b601ac", + "version": "0.19.4" } ''' # END VERSION_JSON diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/array/gufunc.py new/dask-0.19.4/dask/array/gufunc.py --- old/dask-0.19.3/dask/array/gufunc.py 
2018-09-26 23:49:35.000000000 +0200 +++ new/dask-0.19.4/dask/array/gufunc.py 2018-10-09 21:03:29.000000000 +0200 @@ -55,6 +55,100 @@ return ins, outs +def _validate_normalize_axes(axes, axis, keepdims, input_coredimss, output_coredimss): + """ + Validates logic of `axes`/`axis`/`keepdims` arguments and normalize them. + Refer to [1]_ for details + + Arguments + --------- + axes: List of tuples + axis: int + keepdims: bool + input_coredimss: List of Tuple of dims + output_coredimss: List of Tuple of dims + + Returns + ------- + input_axes: List of tuple of int + output_axes: List of tuple of int + + References + ---------- + .. [1] https://docs.scipy.org/doc/numpy/reference/ufuncs.html#optional-keyword-arguments + """ + nin = len(input_coredimss) + nout = 1 if not isinstance(output_coredimss, list) else len(output_coredimss) + + if axes is not None and axis is not None: + raise ValueError("Only one of `axis` or `axes` keyword arguments should be given") + if axes and not isinstance(axes, list): + raise ValueError("`axes` has to be of type list") + + output_coredimss = output_coredimss if nout > 1 else [output_coredimss] + filtered_core_dims = list(filter(len, input_coredimss)) + nr_outputs_with_coredims = len([True for x in output_coredimss if len(x) > 0]) + + if keepdims: + if nr_outputs_with_coredims > 0: + raise ValueError("`keepdims` can only be used for scalar outputs") + output_coredimss = len(output_coredimss) * [filtered_core_dims[0]] + + core_dims = input_coredimss + output_coredimss + if axis is not None: + if not isinstance(axis, int): + raise ValueError("`axis` argument has to be an integer value") + if filtered_core_dims: + cd0 = filtered_core_dims[0] + if len(cd0) != 1: + raise ValueError("`axis` can be used only, if one core dimension is present") + for cd in filtered_core_dims: + if cd0 != cd: + raise ValueError("To use `axis`, all core dimensions have to be equal") + + # Expand dafaults or axis + if axes is None: + if axis is not None: + axes = [(axis,) if cd else tuple() for cd in core_dims] + else: + axes = [tuple(range(-len(icd), 0)) for icd in core_dims] + elif not isinstance(axes, list): + raise ValueError("`axes` argument has to be a list") + axes = [(a,) if isinstance(a, int) else a for a in axes] + + if (((nr_outputs_with_coredims == 0) and (nin != len(axes)) and (nin + nout != len(axes))) or + ((nr_outputs_with_coredims > 0) and (nin + nout != len(axes)))): + raise ValueError("The number of `axes` entries is not equal the number of input and output arguments") + + # Treat outputs + output_axes = axes[nin:] + output_axes = output_axes if output_axes else [tuple(range(-len(ocd), 0)) for ocd in output_coredimss] + input_axes = axes[:nin] + + # Assert we have as many axes as output core dimensions + for idx, (iax, icd) in enumerate(zip(input_axes, input_coredimss)): + if len(iax) != len(icd): + raise ValueError("The number of `axes` entries for argument #{} is not equal " + "the number of respective input core dimensions in signature" + .format(idx)) + if not keepdims: + for idx, (oax, ocd) in enumerate(zip(output_axes, output_coredimss)): + if len(oax) != len(ocd): + raise ValueError("The number of `axes` entries for argument #{} is not equal " + "the number of respective output core dimensions in signature" + .format(idx)) + else: + if input_coredimss: + icd0 = input_coredimss[0] + for icd in input_coredimss: + if icd0 != icd: + raise ValueError("To use `keepdims`, all core dimensions have to be equal") + iax0 = input_axes[0] + output_axes = [iax0 for _ in 
output_coredimss] + + return input_axes, output_axes + + def apply_gufunc(func, signature, *args, **kwargs): """ Apply a generalized ufunc or similar python function to arrays. @@ -83,6 +177,30 @@ According to the specification of numpy.gufunc signature [2]_ *args : numeric Input arrays or scalars to the callable function. + axes: List of tuples, optional, keyword only + A list of tuples with indices of axes a generalized ufunc should operate on. + For instance, for a signature of ``"(i,j),(j,k)->(i,k)"`` appropriate for + matrix multiplication, the base elements are two-dimensional matrices + and these are taken to be stored in the two last axes of each argument. The + corresponding axes keyword would be ``[(-2, -1), (-2, -1), (-2, -1)]``. + For simplicity, for generalized ufuncs that operate on 1-dimensional arrays + (vectors), a single integer is accepted instead of a single-element tuple, + and for generalized ufuncs for which all outputs are scalars, the output + tuples can be omitted. + axis: int, optional, keyword only + A single axis over which a generalized ufunc should operate. This is a short-cut + for ufuncs that operate over a single, shared core dimension, equivalent to passing + in axes with entries of (axis,) for each single-core-dimension argument and ``()`` for + all others. For instance, for a signature ``"(i),(i)->()"``, it is equivalent to passing + in ``axes=[(axis,), (axis,), ()]``. + keepdims: bool, optional, keyword only + If this is set to True, axes which are reduced over will be left in the result as + a dimension with size one, so that the result will broadcast correctly against the + inputs. This option can only be used for generalized ufuncs that operate on inputs + that all have the same number of core dimensions and with outputs that have no core + dimensions , i.e., with signatures like ``"(i),(i)->()"`` or ``"(m,m)->()"``. + If used, the location of the dimensions in the output can be controlled with axes + and axis. output_dtypes : Optional, dtype or list of dtypes, keyword only Valid numpy dtype specification or list thereof. If not given, a call of ``func`` with a small set of data @@ -113,7 +231,7 @@ >>> def stats(x): ... return np.mean(x, axis=-1), np.std(x, axis=-1) >>> a = da.random.normal(size=(10,20,30), chunks=(5, 10, 30)) - >>> mean, std = da.apply_gufunc(stats, "(i)->(),()", a, output_dtypes=2*(a.dtype,)) + >>> mean, std = da.apply_gufunc(stats, "(i)->(),()", a) >>> mean.compute().shape (10, 20) @@ -122,7 +240,7 @@ ... return np.einsum("i,j->ij", x, y) >>> a = da.random.normal(size=( 20,30), chunks=(10, 30)) >>> b = da.random.normal(size=(10, 1,40), chunks=(5, 1, 40)) - >>> c = da.apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, output_dtypes=a.dtype, vectorize=True) + >>> c = da.apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, vectorize=True) >>> c.compute().shape (10, 20, 30, 40) @@ -131,6 +249,9 @@ .. [1] http://docs.scipy.org/doc/numpy/reference/ufuncs.html .. 
[2] http://docs.scipy.org/doc/numpy/reference/c-api.generalized-ufuncs.html """ + axes = kwargs.pop("axes", None) + axis = kwargs.pop("axis", None) + keepdims = kwargs.pop("keepdims", False) output_dtypes = kwargs.pop("output_dtypes", None) output_sizes = kwargs.pop("output_sizes", None) vectorize = kwargs.pop("vectorize", None) @@ -140,14 +261,18 @@ ## Signature if not isinstance(signature, str): raise TypeError('`signature` has to be of type string') - core_input_dimss, core_output_dimss = _parse_gufunc_signature(signature) + input_coredimss, output_coredimss = _parse_gufunc_signature(signature) ## Determine nout: nout = None for functions of one direct return; nout = int for return tuples - nout = None if not isinstance(core_output_dimss, list) else len(core_output_dimss) + nout = None if not isinstance(output_coredimss, list) else len(output_coredimss) ## Determine and handle output_dtypes if output_dtypes is None: - output_dtypes = apply_infer_dtype(func, args, kwargs, "apply_gufunc", "output_dtypes", nout) + if vectorize: + tempfunc = np.vectorize(func, signature=signature) + else: + tempfunc = func + output_dtypes = apply_infer_dtype(tempfunc, args, kwargs, "apply_gufunc", "output_dtypes", nout) if isinstance(output_dtypes, (tuple, list)): if nout is None: @@ -171,26 +296,41 @@ if output_sizes is None: output_sizes = {} + ## Axes + input_axes, output_axes = _validate_normalize_axes(axes, axis, keepdims, input_coredimss, output_coredimss) + # Main code: ## Cast all input arrays to dask args = [asarray(a) for a in args] - if len(core_input_dimss) != len(args): + if len(input_coredimss) != len(args): ValueError("According to `signature`, `func` requires %d arguments, but %s given" - % (len(core_output_dimss), len(args))) + % (len(input_coredimss), len(args))) + + ## Axes: transpose input arguments + transposed_args = [] + for arg, iax, input_coredims in zip(args, input_axes, input_coredimss): + shape = arg.shape + iax = tuple(a if a < 0 else a - len(shape) for a in iax) + tidc = tuple(i for i in range(-len(shape) + 0, 0) if i not in iax) + iax + + transposed_arg = arg.transpose(tidc) + transposed_args.append(transposed_arg) + args = transposed_args ## Assess input args for loop dims input_shapes = [a.shape for a in args] input_chunkss = [a.chunks for a in args] - num_loopdims = [len(s) - len(cd) for s, cd in zip(input_shapes, core_input_dimss)] + num_loopdims = [len(s) - len(cd) for s, cd in zip(input_shapes, input_coredimss)] max_loopdims = max(num_loopdims) if num_loopdims else None - _core_input_shapes = [dict(zip(cid, s[n:])) for s, n, cid in zip(input_shapes, num_loopdims, core_input_dimss)] - core_shapes = merge(output_sizes, *_core_input_shapes) + core_input_shapes = [dict(zip(icd, s[n:])) for s, n, icd in zip(input_shapes, num_loopdims, input_coredimss)] + core_shapes = merge(*core_input_shapes) + core_shapes.update(output_sizes) loop_input_dimss = [tuple("__loopdim%d__" % d for d in range(max_loopdims - n, max_loopdims)) for n in num_loopdims] - input_dimss = [l + c for l, c in zip(loop_input_dimss, core_input_dimss)] + input_dimss = [l + c for l, c in zip(loop_input_dimss, input_coredimss)] - loop_output_dims = max(loop_input_dimss, key=len) if loop_input_dimss else set() + loop_output_dims = max(loop_input_dimss, key=len) if loop_input_dimss else tuple() ## Assess input args for same size and chunk sizes ### Collect sizes and chunksizes of all dims in all arrays @@ -198,12 +338,12 @@ chunksizess = {} for dims, shape, chunksizes in zip(input_dimss, input_shapes, 
input_chunkss): for dim, size, chunksize in zip(dims, shape, chunksizes): - _dimsizes = dimsizess.get(dim, []) - _dimsizes.append(size) - dimsizess[dim] = _dimsizes - _chunksizes = chunksizess.get(dim, []) - _chunksizes.append(chunksize) - chunksizess[dim] = _chunksizes + dimsizes = dimsizess.get(dim, []) + dimsizes.append(size) + dimsizess[dim] = dimsizes + chunksizes_ = chunksizess.get(dim, []) + chunksizes_.append(chunksize) + chunksizess[dim] = chunksizes_ ### Assert correct partitioning, for case: for dim, sizes in dimsizess.items(): #### Check that the arrays have same length for same dimensions or dimension `1` @@ -237,19 +377,18 @@ loop_output_chunks = tmp.chunks dsk = tmp.__dask_graph__() keys = list(flatten(tmp.__dask_keys__())) - _anykey = keys[0] - name, token = _anykey[0].split('-') + name, token = keys[0][0].split('-') ### *) Treat direct output if nout is None: - core_output_dimss = [core_output_dimss] + output_coredimss = [output_coredimss] output_dtypes = [output_dtypes] ## Split output leaf_arrs = [] - for i, cod, odt in zip(count(0), core_output_dimss, output_dtypes): - core_output_shape = tuple(core_shapes[d] for d in cod) - core_chunkinds = len(cod) * (0,) + for i, ocd, odt, oax in zip(count(0), output_coredimss, output_dtypes, output_axes): + core_output_shape = tuple(core_shapes[d] for d in ocd) + core_chunkinds = len(ocd) * (0,) output_shape = loop_output_shape + core_output_shape output_chunks = loop_output_chunks + core_output_shape leaf_name = "%s_%d-%s" % (name, i, token) @@ -259,6 +398,21 @@ chunks=output_chunks, shape=output_shape, dtype=odt) + + ### Axes: + if keepdims: + slices = len(leaf_arr.shape) * (slice(None),) + len(oax) * (np.newaxis,) + leaf_arr = leaf_arr[slices] + + tidcs = [None] * len(leaf_arr.shape) + for i, oa in zip(range(-len(oax), 0), oax): + tidcs[oa] = i + j = 0 + for i in range(len(tidcs)): + if tidcs[i] is None: + tidcs[i] = j + j += 1 + leaf_arr = leaf_arr.transpose(tidcs) leaf_arrs.append(leaf_arr) return leaf_arrs if nout else leaf_arrs[0] # Undo *) from above @@ -281,8 +435,35 @@ signature : String, keyword only Specifies what core dimensions are consumed and produced by ``func``. According to the specification of numpy.gufunc signature [2]_ - output_dtypes : dtype or list of dtypes, keyword only - dtype or list of output dtypes. + axes: List of tuples, optional, keyword only + A list of tuples with indices of axes a generalized ufunc should operate on. + For instance, for a signature of ``"(i,j),(j,k)->(i,k)"`` appropriate for + matrix multiplication, the base elements are two-dimensional matrices + and these are taken to be stored in the two last axes of each argument. The + corresponding axes keyword would be ``[(-2, -1), (-2, -1), (-2, -1)]``. + For simplicity, for generalized ufuncs that operate on 1-dimensional arrays + (vectors), a single integer is accepted instead of a single-element tuple, + and for generalized ufuncs for which all outputs are scalars, the output + tuples can be omitted. + axis: int, optional, keyword only + A single axis over which a generalized ufunc should operate. This is a short-cut + for ufuncs that operate over a single, shared core dimension, equivalent to passing + in axes with entries of (axis,) for each single-core-dimension argument and ``()`` for + all others. For instance, for a signature ``"(i),(i)->()"``, it is equivalent to passing + in ``axes=[(axis,), (axis,), ()]``. 
+ keepdims: bool, optional, keyword only + If this is set to True, axes which are reduced over will be left in the result as + a dimension with size one, so that the result will broadcast correctly against the + inputs. This option can only be used for generalized ufuncs that operate on inputs + that all have the same number of core dimensions and with outputs that have no core + dimensions , i.e., with signatures like ``"(i),(i)->()"`` or ``"(m,m)->()"``. + If used, the location of the dimensions in the output can be controlled with axes + and axis. + output_dtypes : Optional, dtype or list of dtypes, keyword only + Valid numpy dtype specification or list thereof. + If not given, a call of ``func`` with a small set of data + is performed in order to try to automatically determine the + output dtypes. output_sizes : dict, optional, keyword only Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs. @@ -330,6 +511,9 @@ self.pyfunc = pyfunc self.signature = kwargs.pop("signature", None) self.vectorize = kwargs.pop("vectorize", False) + self.axes = kwargs.pop("axes", None) + self.axis = kwargs.pop("axis", None) + self.keepdims = kwargs.pop("keepdims", False) self.output_sizes = kwargs.pop("output_sizes", None) self.output_dtypes = kwargs.pop("output_dtypes", None) self.allow_rechunk = kwargs.pop("allow_rechunk", False) @@ -359,6 +543,9 @@ self.signature, *args, vectorize=self.vectorize, + axes=self.axes, + axis=self.axis, + keepdims=self.keepdims, output_sizes=self.output_sizes, output_dtypes=self.output_dtypes, allow_rechunk=self.allow_rechunk or kwargs.pop("allow_rechunk", False), @@ -374,8 +561,35 @@ signature : String Specifies what core dimensions are consumed and produced by ``func``. According to the specification of numpy.gufunc signature [2]_ - output_dtypes : dtype or list of dtypes, keyword only - dtype or list of output dtypes. + axes: List of tuples, optional, keyword only + A list of tuples with indices of axes a generalized ufunc should operate on. + For instance, for a signature of ``"(i,j),(j,k)->(i,k)"`` appropriate for + matrix multiplication, the base elements are two-dimensional matrices + and these are taken to be stored in the two last axes of each argument. The + corresponding axes keyword would be ``[(-2, -1), (-2, -1), (-2, -1)]``. + For simplicity, for generalized ufuncs that operate on 1-dimensional arrays + (vectors), a single integer is accepted instead of a single-element tuple, + and for generalized ufuncs for which all outputs are scalars, the output + tuples can be omitted. + axis: int, optional, keyword only + A single axis over which a generalized ufunc should operate. This is a short-cut + for ufuncs that operate over a single, shared core dimension, equivalent to passing + in axes with entries of (axis,) for each single-core-dimension argument and ``()`` for + all others. For instance, for a signature ``"(i),(i)->()"``, it is equivalent to passing + in ``axes=[(axis,), (axis,), ()]``. + keepdims: bool, optional, keyword only + If this is set to True, axes which are reduced over will be left in the result as + a dimension with size one, so that the result will broadcast correctly against the + inputs. This option can only be used for generalized ufuncs that operate on inputs + that all have the same number of core dimensions and with outputs that have no core + dimensions , i.e., with signatures like ``"(i),(i)->()"`` or ``"(m,m)->()"``. 
+ If used, the location of the dimensions in the output can be controlled with axes + and axis. + output_dtypes : Optional, dtype or list of dtypes, keyword only + Valid numpy dtype specification or list thereof. + If not given, a call of ``func`` with a small set of data + is performed in order to try to automatically determine the + output dtypes. output_sizes : dict, optional, keyword only Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs. @@ -418,7 +632,7 @@ .. [1] http://docs.scipy.org/doc/numpy/reference/ufuncs.html .. [2] http://docs.scipy.org/doc/numpy/reference/c-api.generalized-ufuncs.html """ - _allowedkeys = {"vectorize", "output_sizes", "output_dtypes", "allow_rechunk"} + _allowedkeys = {"vectorize", "axes", "axis", "keepdims", "output_sizes", "output_dtypes", "allow_rechunk"} if set(_allowedkeys).issubset(kwargs.keys()): raise TypeError("Unsupported keyword argument(s) provided") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/array/tests/test_gufunc.py new/dask-0.19.4/dask/array/tests/test_gufunc.py --- old/dask-0.19.3/dask/array/tests/test_gufunc.py 2018-09-26 23:49:35.000000000 +0200 +++ new/dask-0.19.4/dask/array/tests/test_gufunc.py 2018-10-09 21:03:29.000000000 +0200 @@ -8,7 +8,7 @@ import numpy as np from dask.array.core import Array -from dask.array.gufunc import _parse_gufunc_signature, apply_gufunc,gufunc, as_gufunc +from dask.array.gufunc import _parse_gufunc_signature, _validate_normalize_axes, apply_gufunc,gufunc, as_gufunc # Copied from `numpy.lib.test_test_function_base.py`: @@ -34,6 +34,80 @@ _parse_gufunc_signature('(x)->(x),') +def test_apply_gufunc_axes_input_validation_01(): + def foo(x): + return np.mean(x, axis=-1) + + a = da.random.normal(size=(20, 30), chunks=30) + + with pytest.raises(ValueError): + apply_gufunc(foo, "(i)->()", a, axes=0) + + apply_gufunc(foo, "(i)->()", a, axes=[0]) + apply_gufunc(foo, "(i)->()", a, axes=[(0,)]) + apply_gufunc(foo, "(i)->()", a, axes=[0, tuple()]) + apply_gufunc(foo, "(i)->()", a, axes=[(0,), tuple()]) + + with pytest.raises(ValueError): + apply_gufunc(foo, "(i)->()", a, axes=[(0, 1)]) + + with pytest.raises(ValueError): + apply_gufunc(foo, "(i)->()", a, axes=[0, 0]) + + +def test__validate_normalize_axes_01(): + with pytest.raises(ValueError): + _validate_normalize_axes([(1, 0)], None, False, [('i', 'j')], ('j',)) + + with pytest.raises(ValueError): + _validate_normalize_axes([0, 0], None, False, [('i', 'j')], ('j',)) + + with pytest.raises(ValueError): + _validate_normalize_axes([(0,), 0], None, False, [('i', 'j')], ('j',)) + + i, o = _validate_normalize_axes([(1, 0), 0], None, False, [('i', 'j')], ('j',)) + assert i == [(1, 0)] + assert o == [(0,)] + + +def test__validate_normalize_axes_02(): + i, o = _validate_normalize_axes(None, 0, False, [('i', ), ('i', )], ()) + assert i == [(0,), (0,)] + assert o == [()] + + i, o = _validate_normalize_axes(None, 0, False, [('i',)], ('i',)) + assert i == [(0,)] + assert o == [(0,)] + + i, o = _validate_normalize_axes(None, 0, True, [('i',), ('i',)], ()) + assert i == [(0,), (0,)] + assert o == [(0,)] + + with pytest.raises(ValueError): + _validate_normalize_axes(None, (0,), False, [('i',), ('i',)], ()) + + with pytest.raises(ValueError): + _validate_normalize_axes(None, 0, False, [('i',), ('j',)], ()) + + with pytest.raises(ValueError): + _validate_normalize_axes(None, 0, False, [('i',), ('j',)], ('j',)) + + +def 
test__validate_normalize_axes_03(): + i, o = _validate_normalize_axes(None, 0, True, [('i',)], ()) + assert i == [(0,)] + assert o == [(0,)] + + with pytest.raises(ValueError): + _validate_normalize_axes(None, 0, True, [('i',)], ('i',)) + + with pytest.raises(ValueError): + _validate_normalize_axes([(0, 1), (0, 1)], None, True, [('i', 'j')], ('i', 'j')) + + with pytest.raises(ValueError): + _validate_normalize_axes([(0,), (0,)], None, True, [('i',), ('j',)], ()) + + def test_apply_gufunc_01(): def stats(x): return np.mean(x, axis=-1), np.std(x, axis=-1) @@ -223,7 +297,7 @@ def foo(x): return np.mean(x, axis=-1) - gufoo = gufunc(foo, signature="(i)->()", output_dtypes=float, vectorize=True) + gufoo = gufunc(foo, signature="(i)->()", axis=-1, keepdims=False, output_dtypes=float, vectorize=True) y = gufoo(x) valy = y.compute() @@ -236,7 +310,7 @@ def test_as_gufunc(): x = da.random.normal(size=(10, 5), chunks=(2, 5)) - @as_gufunc("(i)->()", output_dtypes=float, vectorize=True) + @as_gufunc("(i)->()", axis=-1, keepdims=False, output_dtypes=float, vectorize=True) def foo(x): return np.mean(x, axis=-1) @@ -336,3 +410,149 @@ assert_eq(z0, dx + dy) assert_eq(z1, dx - dy) + + [email protected]('keepdims', [False, True]) +def test_apply_gufunc_axis_01(keepdims): + def mymedian(x): + return np.median(x, axis=-1) + + a = np.random.randn(10, 5) + da_ = da.from_array(a, chunks=2) + + m = np.median(a, axis=0, keepdims=keepdims) + dm = apply_gufunc(mymedian, "(i)->()", da_, axis=0, keepdims=keepdims, allow_rechunk=True) + assert_eq(m, dm) + + +def test_apply_gufunc_axis_02(): + def myfft(x): + return np.fft.fft(x, axis=-1) + + a = np.random.randn(10, 5) + da_ = da.from_array(a, chunks=2) + + m = np.fft.fft(a, axis=0) + dm = apply_gufunc(myfft, "(i)->(i)", da_, axis=0, allow_rechunk=True) + assert_eq(m, dm) + + +def test_apply_gufunc_axis_02b(): + def myfilter(x, cn=10, axis=-1): + y = np.fft.fft(x, axis=axis) + y[cn:-cn] = 0 + nx = np.fft.ifft(y, axis=axis) + return np.real(nx) + + a = np.random.randn(3, 6, 4) + da_ = da.from_array(a, chunks=2) + + m = myfilter(a, axis=1) + dm = apply_gufunc(myfilter, "(i)->(i)", da_, axis=1, allow_rechunk=True) + assert_eq(m, dm) + + +def test_apply_gufunc_axis_03(): + def mydiff(x): + return np.diff(x, axis=-1) + + a = np.random.randn(3, 6, 4) + da_ = da.from_array(a, chunks=2) + + m = np.diff(a, axis=1) + dm = apply_gufunc(mydiff, "(i)->(i)", da_, axis=1, output_sizes={'i': 5}, allow_rechunk=True) + assert_eq(m, dm) + + [email protected]('axis', [-2, -1, None]) +def test_apply_gufunc_axis_keepdims(axis): + def mymedian(x): + return np.median(x, axis=-1) + + a = np.random.randn(10, 5) + da_ = da.from_array(a, chunks=2) + + m = np.median(a, axis=-1 if not axis else axis, keepdims=True) + dm = apply_gufunc(mymedian, "(i)->()", da_, axis=axis, keepdims=True, allow_rechunk=True) + assert_eq(m, dm) + + [email protected]('axes', [[0, 1], [(0,), (1,)]]) +def test_apply_gufunc_axes_01(axes): + def mystats(x, y): + return np.std(x, axis=-1) * np.mean(y, axis=-1) + + a = np.random.randn(10, 5) + b = np.random.randn(5, 6) + da_ = da.from_array(a, chunks=2) + db_ = da.from_array(b, chunks=2) + + m = np.std(a, axis=0) * np.mean(b, axis=1) + dm = apply_gufunc(mystats, "(i),(j)->()", da_, db_, axes=axes, allow_rechunk=True) + assert_eq(m, dm) + + +def test_apply_gufunc_axes_02(): + def matmul(x, y): + return np.einsum("...ij,...jk->...ik", x, y) + + a = np.random.randn(3, 2, 1) + b = np.random.randn(3, 7, 5) + + da_ = da.from_array(a, chunks=2) + db = da.from_array(b, chunks=3) + + m 
= np.einsum("jiu,juk->uik", a, b) + dm = apply_gufunc(matmul, "(i,j),(j,k)->(i,k)", da_, db, axes=[(1, 0), (0, -1), (-2, -1)], allow_rechunk=True) + assert_eq(m, dm) + + [email protected](LooseVersion(np.__version__) < '1.12.0', + reason="`np.vectorize(..., signature=...)` not supported yet") +def test_apply_gufunc_axes_two_kept_coredims(): + a = da.random.normal(size=( 20, 30), chunks=(10, 30)) + b = da.random.normal(size=(10, 1, 40), chunks=(5, 1, 40)) + + def outer_product(x, y): + return np.einsum("i,j->ij", x, y) + + c = apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, vectorize=True) + assert c.compute().shape == (10, 20, 30, 40) + + [email protected](LooseVersion(np.__version__) < '1.12.0', + reason="Additional kwargs for this version not supported") +def test_apply_gufunc_via_numba_01(): + numba = pytest.importorskip('numba') + + @numba.guvectorize([(numba.float64[:], numba.float64[:], numba.float64[:])], '(n),(n)->(n)') + def g(x, y, res): + for i in range(x.shape[0]): + res[i] = x[i] + y[i] + + a = da.random.normal(size=(20, 30), chunks=30) + b = da.random.normal(size=(20, 30), chunks=30) + + x = a + b + y = g(a, b, axis=0) + + assert_eq(x, y) + + [email protected](LooseVersion(np.__version__) < '1.12.0', + reason="Additional kwargs for this version not supported") +def test_apply_gufunc_via_numba_02(): + numba = pytest.importorskip('numba') + + @numba.guvectorize([(numba.float64[:], numba.float64[:])], '(n)->()') + def mysum(x, res): + res[0] = 0. + for i in range(x.shape[0]): + res[0] += x[i] + + a = da.random.normal(size=(20, 30), chunks=5) + + x = a.sum(axis=0, keepdims=True) + y = mysum(a, axis=0, keepdims=True, allow_rechunk=True) + + assert_eq(x, y) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/base.py new/dask-0.19.4/dask/base.py --- old/dask-0.19.3/dask/base.py 2018-10-05 20:48:56.000000000 +0200 +++ new/dask-0.19.4/dask/base.py 2018-10-09 18:48:12.000000000 +0200 @@ -825,10 +825,13 @@ else: if get in named_schedulers.values(): _warnned_on_get[0] = True - warnings.warn("The get= keyword has been deprecated. " - "Please use the scheduler= keyword instead with the " - "name of the desired scheduler " - "like 'threads' or 'processes'") + warnings.warn( + "The get= keyword has been deprecated. " + "Please use the scheduler= keyword instead with the name of " + "the desired scheduler like 'threads' or 'processes'\n" + " x.compute(scheduler='threads') \n" + "or with a function that takes the graph and keys\n" + " x.compute(scheduler=my_scheduler_function)") def get_scheduler(get=None, scheduler=None, collections=None, cls=None): @@ -851,7 +854,11 @@ return get if scheduler is not None: - if scheduler.lower() in named_schedulers: + if callable(scheduler): + return scheduler + elif "Client" in type(scheduler).__name__ and hasattr(scheduler, 'get'): + return scheduler.get + elif scheduler.lower() in named_schedulers: return named_schedulers[scheduler.lower()] elif scheduler.lower() in ('dask.distributed', 'distributed'): from distributed.worker import get_client diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/dataframe/core.py new/dask-0.19.4/dask/dataframe/core.py --- old/dask-0.19.3/dask/dataframe/core.py 2018-10-05 20:48:56.000000000 +0200 +++ new/dask-0.19.4/dask/dataframe/core.py 2018-10-09 21:03:29.000000000 +0200 @@ -20,7 +20,7 @@ from .. import array as da from .. 
import core -from ..utils import partial_by_order, Dispatch +from ..utils import partial_by_order, Dispatch, IndexCallable from .. import threaded from ..compatibility import apply, operator_div, bind_method, string_types, Iterator from ..context import globalmethod @@ -896,10 +896,47 @@ """ Purely label-location based indexer for selection by label. >>> df.loc["b"] # doctest: +SKIP - >>> df.loc["b":"d"] # doctest: +SKIP""" + >>> df.loc["b":"d"] # doctest: +SKIP + """ from .indexing import _LocIndexer return _LocIndexer(self) + def _partitions(self, index): + if not isinstance(index, tuple): + index = (index,) + from ..array.slicing import normalize_index + index = normalize_index(index, (self.npartitions,)) + index = tuple(slice(k, k + 1) if isinstance(k, Number) else k + for k in index) + name = 'blocks-' + tokenize(self, index) + new_keys = np.array(self.__dask_keys__(), dtype=object)[index].tolist() + + divisions = [self.divisions[i] for _, i in new_keys] + [self.divisions[new_keys[-1][1] + 1]] + dsk = {(name, i): tuple(key) for i, key in enumerate(new_keys)} + + return new_dd_object(merge(dsk, self.dask), name, self._meta, divisions) + + @property + def partitions(self): + """ Slice dataframe by partitions + + This allows partitionwise slicing of a Dask Dataframe. You can perform normal + Numpy-style slicing but now rather than slice elements of the array you + slice along partitions so, for example, ``df.partitions[:5]`` produces a new + Dask Dataframe of the first five partitions. + + Examples + -------- + >>> df.partitions[0] # doctest: +SKIP + >>> df.partitions[:3] # doctest: +SKIP + >>> df.partitions[::10] # doctest: +SKIP + + Returns + ------- + A Dask DataFrame + """ + return IndexCallable(self._partitions) + # Note: iloc is implemented only on DataFrame def repartition(self, divisions=None, npartitions=None, freq=None, force=False): @@ -1458,19 +1495,22 @@ return DataFrame(dask, keyname, meta, quantiles[0].divisions) @derived_from(pd.DataFrame) - def describe(self, split_every=False): + def describe(self, split_every=False, percentiles=None): # currently, only numeric describe is supported num = self._get_numeric_data() if self.ndim == 2 and len(num.columns) == 0: raise ValueError("DataFrame contains only non-numeric data.") elif self.ndim == 1 and self.dtype == 'object': raise ValueError("Cannot compute ``describe`` on object dtype.") - + if percentiles is None: + percentiles = [0.25, 0.5, 0.75] + else: + percentiles = list(set(sorted(percentiles + [0.5]))) stats = [num.count(split_every=split_every), num.mean(split_every=split_every), num.std(split_every=split_every), num.min(split_every=split_every), - num.quantile([0.25, 0.5, 0.75]), + num.quantile(percentiles), num.max(split_every=split_every)] stats_names = [(s._name, 0) for s in stats] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/dataframe/methods.py new/dask-0.19.4/dask/dataframe/methods.py --- old/dask-0.19.3/dask/dataframe/methods.py 2018-10-05 20:48:56.000000000 +0200 +++ new/dask-0.19.4/dask/dataframe/methods.py 2018-10-09 21:03:29.000000000 +0200 @@ -121,7 +121,7 @@ typ = pd.DataFrame if isinstance(count, pd.Series) else pd.Series part1 = typ([count, mean, std, min], index=['count', 'mean', 'std', 'min']) - q.index = ['25%', '50%', '75%'] + q.index = ['{0:g}%'.format(l * 100) for l in q.index.tolist()] part3 = typ([max], index=['max']) return pd.concat([part1, q, part3]) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/dask-0.19.3/dask/dataframe/tests/test_dataframe.py new/dask-0.19.4/dask/dataframe/tests/test_dataframe.py --- old/dask-0.19.3/dask/dataframe/tests/test_dataframe.py 2018-10-04 23:25:10.000000000 +0200 +++ new/dask-0.19.4/dask/dataframe/tests/test_dataframe.py 2018-10-09 21:03:29.000000000 +0200 @@ -288,6 +288,9 @@ assert_eq(s.describe(), ds.describe()) assert_eq(df.describe(), ddf.describe()) + test_quantiles = [0.25, 0.75] + assert_eq(df.describe(percentiles=test_quantiles), + ddf.describe(percentiles=test_quantiles)) assert_eq(s.describe(), ds.describe(split_every=2)) assert_eq(df.describe(), ddf.describe(split_every=2)) @@ -3279,3 +3282,17 @@ a = ddf.map_partitions(lambda x, y: x, big) assert any(big is v for v in a.dask.values()) + + +def test_partitions_indexer(): + df = pd.DataFrame({'x': range(10)}) + ddf = dd.from_pandas(df, npartitions=5) + + assert_eq(ddf.partitions[0], ddf.get_partition(0)) + assert_eq(ddf.partitions[3], ddf.get_partition(3)) + assert_eq(ddf.partitions[-1], ddf.get_partition(4)) + + assert ddf.partitions[:3].npartitions == 3 + assert ddf.x.partitions[:3].npartitions == 3 + + assert ddf.x.partitions[::2].compute().tolist() == [0, 1, 4, 5, 8, 9] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/datasets.py new/dask-0.19.4/dask/datasets.py --- old/dask-0.19.3/dask/datasets.py 2018-10-05 20:48:54.000000000 +0200 +++ new/dask-0.19.4/dask/datasets.py 2018-10-09 17:02:36.000000000 +0200 @@ -135,8 +135,8 @@ 'telephone': field('person.telephone'), 'address': {'address': field('address.address'), 'city': field('address.city')}, - 'credt-card': {'number': field('payment.credit_card_number'), - 'expiration-date': field('payment.credit_card_expiration_date')}, + 'credit-card': {'number': field('payment.credit_card_number'), + 'expiration-date': field('payment.credit_card_expiration_date')}, } return _make_mimesis({'locale': locale}, schema, npartitions, records_per_partition, seed) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/tests/test_base.py new/dask-0.19.4/dask/tests/test_base.py --- old/dask-0.19.3/dask/tests/test_base.py 2018-10-05 20:48:54.000000000 +0200 +++ new/dask-0.19.4/dask/tests/test_base.py 2018-10-09 18:48:12.000000000 +0200 @@ -811,7 +811,7 @@ assert dsk == dict(y.dask) # but they aren't return dask.get(dsk, keys) - with dask.config.set(array_optimize=None, get=my_get): + with dask.config.set(array_optimize=None, scheduler=my_get): y.compute() @@ -856,3 +856,14 @@ with dask.config.set(scheduler='threads'): assert get_scheduler(scheduler='threads') is dask.threaded.get assert get_scheduler() is None + + +def test_callable_scheduler(): + called = [False] + + def get(dsk, keys, *args, **kwargs): + called[0] = True + return dask.get(dsk, keys) + + assert delayed(lambda: 1)().compute(scheduler=get) == 1 + assert called[0] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask/tests/test_distributed.py new/dask-0.19.4/dask/tests/test_distributed.py --- old/dask-0.19.3/dask/tests/test_distributed.py 2018-09-19 13:54:36.000000000 +0200 +++ new/dask-0.19.4/dask/tests/test_distributed.py 2018-10-09 18:48:12.000000000 +0200 @@ -176,3 +176,11 @@ a = da.ones((3, 3), chunks=c) z = zarr.zeros_like(a, chunks=c) a.to_zarr(z) + + +def test_scheduler_equals_client(loop): + with cluster() as (s, [a, b]): + with Client(s['address'], loop=loop) as client: + x = delayed(lambda: 
1)() + assert x.compute(scheduler=client) == 1 + assert client.run_on_scheduler(lambda dask_scheduler: dask_scheduler.story(x.key)) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask.egg-info/PKG-INFO new/dask-0.19.4/dask.egg-info/PKG-INFO --- old/dask-0.19.3/dask.egg-info/PKG-INFO 2018-10-05 20:57:35.000000000 +0200 +++ new/dask-0.19.4/dask.egg-info/PKG-INFO 2018-10-09 21:27:57.000000000 +0200 @@ -1,12 +1,11 @@ -Metadata-Version: 1.2 +Metadata-Version: 2.1 Name: dask -Version: 0.19.3 +Version: 0.19.4 Summary: Parallel PyData with Task Scheduling Home-page: http://github.com/dask/dask/ -Author: Matthew Rocklin -Author-email: [email protected] +Maintainer: Matthew Rocklin +Maintainer-email: [email protected] License: BSD -Description-Content-Type: UNKNOWN Description: Dask ==== @@ -45,3 +44,9 @@ Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.* +Provides-Extra: dataframe +Provides-Extra: array +Provides-Extra: bag +Provides-Extra: distributed +Provides-Extra: delayed +Provides-Extra: complete diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/dask.egg-info/SOURCES.txt new/dask-0.19.4/dask.egg-info/SOURCES.txt --- old/dask-0.19.3/dask.egg-info/SOURCES.txt 2018-10-05 20:57:35.000000000 +0200 +++ new/dask-0.19.4/dask.egg-info/SOURCES.txt 2018-10-09 21:27:57.000000000 +0200 @@ -252,7 +252,6 @@ docs/source/index.rst docs/source/install.rst docs/source/logos.rst -docs/source/machine-learning.rst docs/source/optimize.rst docs/source/presentations.rst docs/source/remote-data-services.rst diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/api.rst new/dask-0.19.4/docs/source/api.rst --- old/dask-0.19.3/docs/source/api.rst 2018-09-30 16:48:25.000000000 +0200 +++ new/dask-0.19.4/docs/source/api.rst 2018-10-09 15:05:48.000000000 +0200 @@ -5,7 +5,7 @@ - The :doc:`Dask Array API <array-api>` follows the Numpy API - The :doc:`Dask Dataframe API <dataframe-api>` follows the Pandas API -- The `Dask-ML API <https://ml.dask.org/en/latest/modules/api.html>`_ follows the Scikit-Learn API and other related machine learning libraries +- The `Dask-ML API <https://ml.dask.org/modules/api.html>`_ follows the Scikit-Learn API and other related machine learning libraries - The :doc:`Dask Bag API <bag-api>` follows the map/filter/groupby/reduce API common in PySpark, PyToolz, and the Python standard library - The :doc:`Dask Delayed API <delayed-api>` wraps general Python code - The :doc:`Real-time Futures API <futures>` follows the `concurrent.futures <https://docs.python.org/3/library/concurrent.futures.html>`_ API from the standard library. 
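
A side note for reviewers of the scheduler-keyword change in dask/base.py above: ``get_scheduler`` now also accepts a plain callable, as well as Client-like objects exposing a ``get`` method, via ``scheduler=``. Below is a minimal sketch of the behaviour exercised by the new ``test_callable_scheduler`` test, assuming only dask itself is installed; the helper name ``my_scheduler`` and the toy computation are illustrative only:

    import dask
    from dask import delayed

    calls = []

    def my_scheduler(dsk, keys, *args, **kwargs):
        # A scheduler is any callable taking a graph and keys; here we just
        # record the call and defer to dask's synchronous get.
        calls.append(True)
        return dask.get(dsk, keys)

    assert delayed(lambda: 1)().compute(scheduler=my_scheduler) == 1
    assert calls  # the custom callable was actually used

Passing a ``distributed.Client`` instance works the same way: the new branch in ``get_scheduler`` dispatches to the client's ``get`` method.
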
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/changelog.rst new/dask-0.19.4/docs/source/changelog.rst --- old/dask-0.19.3/docs/source/changelog.rst 2018-10-05 20:56:12.000000000 +0200 +++ new/dask-0.19.4/docs/source/changelog.rst 2018-10-09 21:25:03.000000000 +0200 @@ -1,6 +1,37 @@ Changelog ========= +0.19.4 / 2018-10-09 +------------------- + +Array ++++++ + +- Implement ``apply_gufunc(..., axes=..., keepdims=...)`` (:pr:`3985`) `Markus Gonser`_ + +Bag ++++ + +- Fix typo in datasets.make_people (:pr:`4069`) `Matthew Rocklin`_ + +Dataframe ++++++++++ + +- Added `percentiles` options for `dask.dataframe.describe` method (:pr:`4067`) `Zhenqing Li`_ +- Add DataFrame.partitions accessor similar to Array.blocks (:pr:`4066`) `Matthew Rocklin`_ + +Core +++++ + +- Pass get functions and Clients through scheduler keyword (:pr:`4062`) `Matthew Rocklin`_ + +Documentation ++++++++++++++ + +- Fix Typo on hpc example. (missing `=` in kwarg). (:pr:`4068`) `Matthias Bussonier`_ +- Extensive copy-editing: (:pr:`4065`), (:pr:`4064`), (:pr:`4063`) `Miguel Farrajota`_ + + 0.19.3 / 2018-10-05 ------------------- @@ -1460,3 +1491,5 @@ .. _`Jeremy Chan`: https://github.com/convexset .. _`Eric Wolak`: https://github.com/epall .. _`Miguel Farrajota`: https://github.com/farrajota +.. _`Zhenqing Li`: https://github.com/DigitalPig +.. _`Matthias Bussonier`: https://github.com/Carreau diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/dataframe-api.rst new/dask-0.19.4/docs/source/dataframe-api.rst --- old/dask-0.19.3/docs/source/dataframe-api.rst 2018-09-26 15:00:28.000000000 +0200 +++ new/dask-0.19.4/docs/source/dataframe-api.rst 2018-10-09 17:02:36.000000000 +0200 @@ -55,6 +55,7 @@ DataFrame.ndim DataFrame.nlargest DataFrame.npartitions + DataFrame.partitions DataFrame.pow DataFrame.quantile DataFrame.query diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/index.rst new/dask-0.19.4/docs/source/index.rst --- old/dask-0.19.3/docs/source/index.rst 2018-09-30 16:48:25.000000000 +0200 +++ new/dask-0.19.4/docs/source/index.rst 2018-10-09 15:05:48.000000000 +0200 @@ -171,7 +171,7 @@ dataframe.rst delayed.rst futures.rst - machine-learning.rst + Machine Learning <https://ml.dask.org> api.rst **Scheduling** diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/machine-learning.rst new/dask-0.19.4/docs/source/machine-learning.rst --- old/dask-0.19.3/docs/source/machine-learning.rst 2018-09-30 16:48:25.000000000 +0200 +++ new/dask-0.19.4/docs/source/machine-learning.rst 1970-01-01 01:00:00.000000000 +0100 @@ -1,11 +0,0 @@ -Machine Learning -================ - -Dask facilitates machine learning, statistics, and optimization workloads in a -variety of ways. Generally Dask tries to support other high-quality solutions -within the PyData ecosystem rather than reinvent new systems. Dask makes it -easier to scale single-machine libraries like Scikit-Learn where possible and -makes using distributed libraries like XGBoost or Tensorflow more comfortable -for everyday users. - -See the separate `Dask-ML documentation <https://ml.dask.org/en/latest>`_ for more information. 
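
For reviewers unfamiliar with the headline Array change summarised in the changelog above: ``apply_gufunc`` now takes ``axes=``, ``axis=`` and ``keepdims=`` keywords analogous to NumPy's generalized-ufunc arguments. A minimal usage sketch in the spirit of the new ``test_apply_gufunc_axis_*`` tests; the array size, chunking and reducing function are illustrative only:

    import numpy as np
    import dask.array as da

    def mymedian(x):
        # Reduces the core dimension "(i)", which apply_gufunc passes last.
        return np.median(x, axis=-1)

    a = da.random.normal(size=(10, 5), chunks=(2, 5))

    # Reduce over axis 0 instead of the last axis and keep it as size 1;
    # allow_rechunk is needed because axis 0 spans several chunks here.
    m = da.apply_gufunc(mymedian, "(i)->()", a,
                        axis=0, keepdims=True, allow_rechunk=True)
    assert m.compute().shape == (1, 5)
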
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/setup/cloud.rst new/dask-0.19.4/docs/source/setup/cloud.rst --- old/dask-0.19.3/docs/source/setup/cloud.rst 2018-09-24 15:51:28.000000000 +0200 +++ new/dask-0.19.4/docs/source/setup/cloud.rst 2018-10-09 16:10:28.000000000 +0200 @@ -1,9 +1,8 @@ Cloud Deployments ================= -To get started running Dask on common Cloud providers -like Amazon, Google, or Microsoft -we currently recommend deploying +To get started running Dask on common Cloud providers like Amazon, +Google, or Microsoft, we currently recommend deploying :doc:`Dask with Kubernetes and Helm <kubernetes-helm>`. All three major cloud vendors now provide managed Kubernetes services. @@ -14,7 +13,7 @@ ----------- You may want to install additional libraries in your Jupyter and worker images -to access the object stores of each cloud +to access the object stores of each cloud: - `s3fs <https://s3fs.readthedocs.io/>`_ for Amazon's S3 - `gcsfs <https://gcsfs.readthedocs.io/>`_ for Google's GCS diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/setup/hpc.rst new/dask-0.19.4/docs/source/setup/hpc.rst --- old/dask-0.19.3/docs/source/setup/hpc.rst 2018-10-05 20:48:54.000000000 +0200 +++ new/dask-0.19.4/docs/source/setup/hpc.rst 2018-10-09 17:02:32.000000000 +0200 @@ -37,7 +37,7 @@ from dask_jobqueue import PBSCluster cluster = PBSCluster(cores=36, - memory"100GB", + memory="100GB", project='P48500028', queue='premium', walltime='02:00:00') diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/setup/kubernetes-helm.rst new/dask-0.19.4/docs/source/setup/kubernetes-helm.rst --- old/dask-0.19.3/docs/source/setup/kubernetes-helm.rst 2018-09-24 15:51:28.000000000 +0200 +++ new/dask-0.19.4/docs/source/setup/kubernetes-helm.rst 2018-10-09 16:10:28.000000000 +0200 @@ -1,18 +1,18 @@ Kubernetes and Helm =================== -It is easy to launch a Dask cluster and Jupyter notebook server on cloud +It is easy to launch a Dask cluster and a Jupyter notebook server on cloud resources using Kubernetes_ and Helm_. .. _Kubernetes: https://kubernetes.io/ .. _Helm: https://helm.sh/ This is particularly useful when you want to deploy a fresh Python environment -on Cloud services, like Amazon Web Services, Google Compute Engine, or +on Cloud services like Amazon Web Services, Google Compute Engine, or Microsoft Azure. If you already have Python environments running in a pre-existing Kubernetes -cluster then you may prefer the :doc:`Kubernetes native<kubernetes-native>` +cluster, then you may prefer the :doc:`Kubernetes native<kubernetes-native>` documentation, which is a bit lighter weight. @@ -21,17 +21,17 @@ This document assumes that you have a Kubernetes cluster and Helm installed. -If this is not the case then you might consider setting up a Kubernetes cluster -either on one of the common cloud providers like Google, Amazon, or -Microsoft's. We recommend the first part of the documentation in the guide +If this is not the case, then you might consider setting up a Kubernetes cluster +on one of the common cloud providers like Google, Amazon, or +Microsoft. We recommend the first part of the documentation in the guide `Zero to JupyterHub <http://zero-to-jupyterhub.readthedocs.io/en/latest/>`_ -that focuses on Kubernetes and Helm. You do not need to follow all of these -instructions. 
JupyterHub is not necessary to deploy Dask: +that focuses on Kubernetes and Helm (you do not need to follow all of these +instructions). Also, JupyterHub is not necessary to deploy Dask: - `Creating a Kubernetes Cluster <https://zero-to-jupyterhub.readthedocs.io/en/v0.4-doc/create-k8s-cluster.html>`_ - `Setting up Helm <https://zero-to-jupyterhub.readthedocs.io/en/v0.4-doc/setup-helm.html>`_ -Alternatively you may want to experiment with Kubernetes locally using +Alternatively, you may want to experiment with Kubernetes locally using `Minikube <https://kubernetes.io/docs/getting-started-guides/minikube/>`_. @@ -45,7 +45,7 @@ helm repo update -Now you can launch Dask on your Kubernetes cluster using the Dask Helm_ chart:: +Now, you can launch Dask on your Kubernetes cluster using the Dask Helm_ chart:: helm install stable/dask @@ -56,7 +56,7 @@ Verify Deployment ----------------- -This might make a minute to deploy. You can check on the status with +This might take a minute to deploy. You can check its status with ``kubectl``:: kubectl get pods @@ -82,9 +82,9 @@ Notice the name ``bald-eel``. This is the name that Helm has given to your particular deployment of Dask. You could, for example, have multiple -Dask-and-Jupyter clusters running at once and each would be given a different -name. You will use this name to refer to your deployment in the future. You -can list all active helm deployments with:: +Dask-and-Jupyter clusters running at once, and each would be given a different +name. Note that you will need to use this name to refer to your deployment in the future. +Additionally, you can list all active helm deployments with:: helm list @@ -95,7 +95,7 @@ Connect to Dask and Jupyter --------------------------- -When we ran ``kubectl get services`` we saw some externally visible IPs +When we ran ``kubectl get services``, we saw some externally visible IPs: .. code-block:: bash @@ -105,8 +105,8 @@ bald-eel-scheduler LoadBalancer 10.11.245.241 35.202.201.129 8786:31166/TCP,80:31626/TCP 2m kubernetes ClusterIP 10.11.240.1 <none> 443/TCP 48m -We can navigate to these from any web browser. One is the Dask diagnostic -dashboard. The other is the Jupyter server. You can log into the Jupyter +We can navigate to these services from any web browser. Here, one is the Dask diagnostic +dashboard, and the other is the Jupyter server. You can log into the Jupyter notebook server with the password, ``dask``. You can create a notebook and create a Dask client from there. The @@ -131,12 +131,12 @@ Configure Environment --------------------- -By default the Helm deployment launches three workers using two cores each and +By default, the Helm deployment launches three workers using two cores each and a standard conda environment. We can customize this environment by creating a small yaml file that implements a subset of the values in the -`dask helm chart values.yaml file <https://github.com/dask/helm-chart/blob/master/dask/values.yaml>`_ +`dask helm chart values.yaml file <https://github.com/dask/helm-chart/blob/master/dask/values.yaml>`_. -For example we can increase the number of workers, and include extra conda and +For example, we can increase the number of workers, and include extra conda and pip packages to install on the both the workers and Jupyter server (these two environments should be matched). 
@@ -168,13 +168,13 @@ - name: EXTRA_PIP_PACKAGES value: s3fs dask-ml --upgrade -This config file overrides configuration for number and size of workers and the +This config file overrides the configuration for the number and size of workers and the conda and pip packages installed on the worker and Jupyter containers. In -general we will want to make sure that these two software environments match. +general, we will want to make sure that these two software environments match. Update your deployment to use this configuration file. Note that *you will not -use helm install* for this stage. That would create a *new* deployment on the -same Kubernetes cluster. Instead you will upgrade your existing deployment by +use helm install* for this stage: that would create a *new* deployment on the +same Kubernetes cluster. Instead, you will upgrade your existing deployment by using the current name:: helm upgrade bald-eel stable/dask -f config.yaml @@ -188,10 +188,10 @@ Check status and logs --------------------- -For standard issues you should be able to see worker status and logs using the -Dask dashboard (in particular see the worker links from the ``info/`` page). -However if your workers aren't starting you can check on the status of pods and -their logs with the following commands +For standard issues, you should be able to see the worker status and logs using the +Dask dashboard (in particular, you can see the worker links from the ``info/`` page). +However, if your workers aren't starting, you can check the status of pods and +their logs with the following commands: .. code-block:: bash @@ -228,15 +228,15 @@ ... -Delete Helm deployment ----------------------- +Delete a Helm deployment +------------------------ You can always delete a helm deployment using its name:: helm delete bald-eel --purge Note that this does not destroy any clusters that you may have allocated on a -Cloud service, you will need to delete those explicitly. +Cloud service (you will need to delete those explicitly). Avoid the Jupyter Server diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/setup/python-advanced.rst new/dask-0.19.4/docs/source/setup/python-advanced.rst --- old/dask-0.19.3/docs/source/setup/python-advanced.rst 2018-09-24 15:51:28.000000000 +0200 +++ new/dask-0.19.4/docs/source/setup/python-advanced.rst 2018-10-09 16:10:28.000000000 +0200 @@ -1,7 +1,7 @@ Python API (advanced) ===================== -In some rare cases experts may want to create ``Scheduler`` and ``Worker`` +In some rare cases, experts may want to create ``Scheduler`` and ``Worker`` objects explicitly in Python manually. This is often necessary when making tools to automatically deploy Dask in custom settings. @@ -11,7 +11,7 @@ Scheduler --------- -Start the Scheduler, provide the listening port (defaults to 8786) and Tornado +To start the Scheduler, provide the listening port (defaults to 8786) and Tornado IOLoop (defaults to ``IOLoop.current()``) .. code-block:: python @@ -27,7 +27,7 @@ loop.start() Alternatively, you may want the IOLoop and scheduler to run in a separate -thread. In that case you would replace the ``loop.start()`` call with the +thread. In this case, you would replace the ``loop.start()`` call with the following: .. code-block:: python @@ -39,7 +39,7 @@ Worker ------ -On other nodes start worker processes that point to the URL of the scheduler. +On other nodes, start worker processes that point to the URL of the scheduler. .. 
code-block:: python @@ -55,8 +55,8 @@ Alternatively, replace ``Worker`` with ``Nanny`` if you want your workers to be managed in a separate process by a local nanny process. This allows workers to -restart themselves in case of failure, provides some additional monitoring, and -is useful when coordinating many workers that should live in different -processes to avoid the GIL_. +restart themselves in case of failure. Also, it provides some additional monitoring, +and is useful when coordinating many workers that should live in different +processes in order to avoid the GIL_. .. _GIL: https://docs.python.org/3/glossary.html#term-gil diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-0.19.3/docs/source/spark.rst new/dask-0.19.4/docs/source/spark.rst --- old/dask-0.19.3/docs/source/spark.rst 2018-09-30 16:48:25.000000000 +0200 +++ new/dask-0.19.4/docs/source/spark.rst 2018-10-09 15:05:48.000000000 +0200 @@ -98,7 +98,7 @@ - Dask allows you to specify arbitrary task graphs for more complex and custom systems that are not part of the standard set of collections. -.. _dask-ml: https://ml.dask.org/en/latest +.. _dask-ml: https://ml.dask.org Reasons you might choose Spark

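
A closing note on the two Dataframe additions in this update, the ``percentiles=`` option for ``describe`` and the ``DataFrame.partitions`` accessor: here is a short sketch mirroring the new tests in test_dataframe.py, where the column name ``x`` and the partition counts are illustrative only:

    import pandas as pd
    import dask.dataframe as dd

    df = pd.DataFrame({'x': range(10)})
    ddf = dd.from_pandas(df, npartitions=5)

    # describe() now accepts custom percentiles; the median is always added.
    print(ddf.describe(percentiles=[0.25, 0.75]).compute())

    # Partition-wise slicing, analogous to Array.blocks.
    assert ddf.partitions[:3].npartitions == 3
    assert ddf.x.partitions[::2].compute().tolist() == [0, 1, 4, 5, 8, 9]
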