Hello community, here is the log from the commit of package python-dask for openSUSE:Factory checked in at 2019-11-17 19:23:27 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Comparing /work/SRC/openSUSE:Factory/python-dask (Old) and /work/SRC/openSUSE:Factory/.python-dask.new.26869 (New) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-dask" Sun Nov 17 19:23:27 2019 rev:23 rq:749097 version:2.8.0 Changes: -------- --- /work/SRC/openSUSE:Factory/python-dask/python-dask.changes 2019-11-13 13:26:41.603595513 +0100 +++ /work/SRC/openSUSE:Factory/.python-dask.new.26869/python-dask.changes 2019-11-17 19:23:28.898857695 +0100 @@ -1,0 +2,34 @@ +Sat Nov 16 17:53:12 UTC 2019 - Arun Persaud <a...@gmx.de> + +- update to version 2.8.0: + * Array + + Implement complete dask.array.tile function (:pr:`5574`) Bouwe + Andela + + Add median along an axis with automatic rechunking (:pr:`5575`) + Matthew Rocklin + + Allow da.asarray to chunk inputs (:pr:`5586`) Matthew Rocklin + * Bag + + Use key_split in Bag name (:pr:`5571`) Matthew Rocklin + * Core + + Switch Doctests to Py3.7 (:pr:`5573`) Ryan Nazareth + + Relax get_colors test to adapt to new Bokeh release (:pr:`5576`) + Matthew Rocklin + + Add dask.blockwise.fuse_roots optimization (:pr:`5451`) Matthew + Rocklin + + Add sizeof implementation for small dicts (:pr:`5578`) Matthew + Rocklin + + Update fsspec, gcsfs, s3fs (:pr:`5588`) Tom Augspurger + * DataFrame + + Add dropna argument to groupby (:pr:`5579`) Richard J Zamora + + Revert "Remove import of dask_cudf, which is now a part of cudf + (:pr:`5568`)" (:pr:`5590`) Matthew Rocklin + * Documentation + + Add best practice for dask.compute function (:pr:`5583`) Matthew + Rocklin + + Create FUNDING.yml (:pr:`5587`) Gina Helfrich + + Add screencast for coordination primitives (:pr:`5593`) Matthew + Rocklin + + Move funding to .github repo (:pr:`5589`) Tom Augspurger + + Update calendar link (:pr:`5569`) Tom Augspurger + +------------------------------------------------------------------- Old: ---- dask-2.7.0.tar.gz New: ---- dask-2.8.0.tar.gz ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ python-dask.spec ++++++ --- /var/tmp/diff_new_pack.Bj0fHb/_old 2019-11-17 19:23:29.442857463 +0100 +++ /var/tmp/diff_new_pack.Bj0fHb/_new 2019-11-17 19:23:29.446857462 +0100 @@ -27,12 +27,11 @@ %endif %define skip_python2 1 Name: python-dask%{psuffix} -Version: 2.7.0 +Version: 2.8.0 Release: 0 Summary: Minimal task scheduling abstraction License: BSD-3-Clause -Group: Development/Languages/Python -URL: http://github.com/ContinuumIO/dask/ +URL: https://github.com/ContinuumIO/dask/ Source: https://files.pythonhosted.org/packages/source/d/dask/dask-%{version}.tar.gz BuildRequires: %{python_module setuptools} BuildRequires: fdupes @@ -104,7 +103,6 @@ # This must have a Requires for dask and all the dask subpackages %package all Summary: All dask components -Group: Development/Languages/Python Requires: %{name} = %{version} Requires: %{name}-array = %{version} Requires: %{name}-bag = %{version} @@ -125,7 +123,6 @@ %package array Summary: Numpy-like array data structure for dask -Group: Development/Languages/Python Requires: %{name} = %{version} Requires: python-numpy >= 1.13.0 Recommends: python-chest @@ -149,7 +146,6 @@ %package bag Summary: Data structure generic python objects in dask -Group: Development/Languages/Python Requires: %{name} = %{version} Requires: %{name}-multiprocessing = %{version} Requires: python-cloudpickle >= 0.2.1 @@ -173,7 +169,6 @@ %package dataframe Summary: Pandas-like DataFrame data structure for dask -Group: Development/Languages/Python Requires: %{name} = %{version} Requires: %{name}-array = %{version} Requires: %{name}-multiprocessing = %{version} @@ -208,7 +203,6 @@ %package distributed Summary: Interface with the distributed task 
scheduler in dask -Group: Development/Languages/Python Requires: %{name} = %{version} Requires: python-distributed >= 2.0 @@ -228,7 +222,6 @@ %package dot Summary: Display dask graphs using graphviz -Group: Development/Languages/Python Requires: %{name} = %{version} Requires: graphviz Requires: graphviz-gd @@ -247,7 +240,6 @@ %package multiprocessing Summary: Display dask graphs using graphviz -Group: Development/Languages/Python Requires: %{name} = %{version} Requires: python-cloudpickle >= 0.2.1 Requires: python-partd >= 0.3.7 @@ -282,7 +274,7 @@ # test_persist # test_local_get_with_distributed_active # test_local_scheduler -%python_expand PYTHONPATH=%{buildroot}%{$python_sitelib} py.test-%{python_bin_suffix} -v dask/tests -k 'not (test_serializable_groupby_agg or test_persist or test_local_get_with_distributed_active or test_await or test_local_scheduler)' +%pytest dask/tests -k 'not (test_serializable_groupby_agg or test_persist or test_local_get_with_distributed_active or test_await or test_local_scheduler)' %endif %if !%{with test} ++++++ dask-2.7.0.tar.gz -> dask-2.8.0.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/PKG-INFO new/dask-2.8.0/PKG-INFO --- old/dask-2.7.0/PKG-INFO 2019-11-08 22:06:23.000000000 +0100 +++ new/dask-2.8.0/PKG-INFO 2019-11-14 23:57:18.000000000 +0100 @@ -1,6 +1,6 @@ Metadata-Version: 2.1 Name: dask -Version: 2.7.0 +Version: 2.8.0 Summary: Parallel PyData with Task Scheduling Home-page: https://github.com/dask/dask/ Maintainer: Matthew Rocklin @@ -43,10 +43,10 @@ Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Requires-Python: >=3.6 -Provides-Extra: dataframe -Provides-Extra: delayed +Provides-Extra: complete Provides-Extra: array -Provides-Extra: distributed Provides-Extra: diagnostics +Provides-Extra: dataframe Provides-Extra: bag -Provides-Extra: complete +Provides-Extra: delayed +Provides-Extra: distributed diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/_version.py new/dask-2.8.0/dask/_version.py --- old/dask-2.7.0/dask/_version.py 2019-11-08 22:06:23.000000000 +0100 +++ new/dask-2.8.0/dask/_version.py 2019-11-14 23:57:18.000000000 +0100 @@ -11,8 +11,8 @@ { "dirty": false, "error": null, - "full-revisionid": "98a1e61fcf9230e3a4dcdf8523e435ed83dfb2c0", - "version": "2.7.0" + "full-revisionid": "539d1e27a8ccce01de5f3d49f1748057c27552f2", + "version": "2.8.0" } ''' # END VERSION_JSON diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/__init__.py new/dask-2.8.0/dask/array/__init__.py --- old/dask-2.7.0/dask/array/__init__.py 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/dask/array/__init__.py 2019-11-13 18:07:07.000000000 +0100 @@ -190,6 +190,7 @@ all, min, max, + median, moment, trace, argmin, diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/core.py new/dask-2.8.0/dask/array/core.py --- old/dask-2.7.0/dask/array/core.py 2019-11-05 22:48:29.000000000 +0100 +++ new/dask-2.8.0/dask/array/core.py 2019-11-13 21:17:45.000000000 +0100 @@ -45,7 +45,6 @@ is_integer, IndexCallable, funcname, - derived_from, SerializableLock, Dispatch, factors, @@ -3672,7 +3671,7 @@ return stack(a) elif not isinstance(getattr(a, "shape", None), Iterable): a = np.asarray(a) - return from_array(a, chunks=a.shape, getitem=getter_inline, **kwargs) + return from_array(a, 
getitem=getter_inline, **kwargs) def asanyarray(a): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/creation.py new/dask-2.8.0/dask/array/creation.py --- old/dask-2.7.0/dask/array/creation.py 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/dask/array/creation.py 2019-11-12 16:54:02.000000000 +0100 @@ -796,17 +796,28 @@ @derived_from(np) def tile(A, reps): - if not isinstance(reps, Integral): - raise NotImplementedError("Only integer valued `reps` supported.") - - if reps < 0: + try: + tup = tuple(reps) + except TypeError: + tup = (reps,) + if any(i < 0 for i in tup): raise ValueError("Negative `reps` are not allowed.") - elif reps == 0: - return A[..., :0] - elif reps == 1: - return A + c = asarray(A) + + if all(tup): + for nrep in tup[::-1]: + c = nrep * [c] + return block(c) - return concatenate(reps * [A], axis=-1) + d = len(tup) + if d < c.ndim: + tup = (1,) * (c.ndim - d) + tup + if c.ndim < d: + shape = (1,) * (d - c.ndim) + c.shape + else: + shape = c.shape + shape_out = tuple(s * t for s, t in zip(shape, tup)) + return empty(shape=shape_out, dtype=c.dtype) def expand_pad_value(array, pad_value): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/optimization.py new/dask-2.8.0/dask/array/optimization.py --- old/dask-2.7.0/dask/array/optimization.py 2019-11-07 03:43:11.000000000 +0100 +++ new/dask-2.8.0/dask/array/optimization.py 2019-11-13 18:07:07.000000000 +0100 @@ -4,7 +4,7 @@ import numpy as np from .core import getter, getter_nofancy, getter_inline -from ..blockwise import optimize_blockwise +from ..blockwise import optimize_blockwise, fuse_roots from ..core import flatten, reverse_dict from ..optimization import cull, fuse, inline_functions from ..utils import ensure_dict @@ -40,6 +40,7 @@ # High level stage optimization if isinstance(dsk, HighLevelGraph): dsk = optimize_blockwise(dsk, keys=keys) + dsk = fuse_roots(dsk, keys=keys) # Low level task optimizations dsk = ensure_dict(dsk) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/reductions.py new/dask-2.8.0/dask/array/reductions.py --- old/dask-2.7.0/dask/array/reductions.py 2019-11-08 20:58:43.000000000 +0100 +++ new/dask-2.8.0/dask/array/reductions.py 2019-11-13 18:07:07.000000000 +0100 @@ -1,4 +1,5 @@ import builtins +from collections.abc import Iterable import operator from functools import partial, wraps from itertools import product, repeat @@ -20,7 +21,7 @@ from .numpy_compat import ma_divide, divide as np_divide from ..base import tokenize from ..highlevelgraph import HighLevelGraph -from ..utils import ignoring, funcname, Dispatch, deepmap, getargspec +from ..utils import ignoring, funcname, Dispatch, deepmap, getargspec, derived_from from .. import config # Generic functions to support chunks of different types @@ -1251,3 +1252,36 @@ @wraps(np.trace) def trace(a, offset=0, axis1=0, axis2=1, dtype=None): return diagonal(a, offset=offset, axis1=axis1, axis2=axis2).sum(-1, dtype=dtype) + + +@derived_from(np) +def median(a, axis=None, keepdims=False, out=None): + """ + This works by automatically chunking the reduced axes to a single chunk + and then calling ``numpy.median`` function across the remaining dimensions + """ + if axis is None: + raise NotImplementedError( + "The da.median function only works along an axis. 
" + "The full algorithm is difficult to do in parallel" + ) + + if not isinstance(axis, Iterable): + axis = (axis,) + + axis = [ax + a.ndim if ax < 0 else ax for ax in axis] + + a = a.rechunk({ax: -1 if ax in axis else "auto" for ax in range(a.ndim)}) + + result = a.map_blocks( + np.median, + axis=axis, + keepdims=keepdims, + drop_axis=axis if not keepdims else None, + chunks=[1 if ax in axis else c for ax, c in enumerate(a.chunks)] + if keepdims + else None, + ) + + result = handle_out(out, result) + return result diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/tests/test_array_core.py new/dask-2.8.0/dask/array/tests/test_array_core.py --- old/dask-2.7.0/dask/array/tests/test_array_core.py 2019-11-05 22:48:29.000000000 +0100 +++ new/dask-2.8.0/dask/array/tests/test_array_core.py 2019-11-13 21:17:45.000000000 +0100 @@ -2372,6 +2372,13 @@ assert not any(isinstance(v, np.ndarray) for v in x.dask.values()) +def test_asarray_chunks(): + with dask.config.set({"array.chunk-size": "100 B"}): + x = np.ones(1000) + d = da.asarray(x) + assert d.npartitions > 1 + + @pytest.mark.filterwarnings("ignore:the matrix subclass") def test_asanyarray(): x = np.matrix([1, 2, 3]) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/tests/test_array_function.py new/dask-2.8.0/dask/array/tests/test_array_function.py --- old/dask-2.7.0/dask/array/tests/test_array_function.py 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/dask/array/tests/test_array_function.py 2019-11-13 18:07:07.000000000 +0100 @@ -61,7 +61,6 @@ lambda x: np.min_scalar_type(x), lambda x: np.linalg.det(x), lambda x: np.linalg.eigvals(x), - lambda x: np.median(x), ], ) def test_array_notimpl_function_dask(func): @@ -226,15 +225,14 @@ assert_eq(xx, yy, check_meta=False) -def test_median_func(): +def test_non_existent_func(): # Regression test for __array_function__ becoming default in numpy 1.17 - # dask has no median function, so ensure that this still calls np.median - image = da.from_array(np.array([[0, 1], [1, 2]]), chunks=(1, 2)) + # dask has no sort function, so ensure that this still calls np.sort + x = da.from_array(np.array([1, 2, 4, 3]), chunks=(2,)) if IS_NEP18_ACTIVE: with pytest.warns( - FutureWarning, - match="The `numpy.median` function is not implemented by Dask", + FutureWarning, match="The `numpy.sort` function is not implemented by Dask" ): - assert int(np.median(image)) == 1 + assert list(np.sort(x)) == [1, 2, 3, 4] else: - assert int(np.median(image)) == 1 + assert list(np.sort(x)) == [1, 2, 3, 4] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/tests/test_creation.py new/dask-2.8.0/dask/array/tests/test_creation.py --- old/dask-2.7.0/dask/array/tests/test_creation.py 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/dask/array/tests/test_creation.py 2019-11-12 16:54:02.000000000 +0100 @@ -571,9 +571,18 @@ assert all(concat(d.repeat(r).chunks)) +@pytest.mark.parametrize("reps", [2, (2, 2), (1, 2), (2, 1), (2, 3, 4, 0)]) +def test_tile_basic(reps): + a = da.asarray([0, 1, 2]) + b = [[1, 2], [3, 4]] + + assert_eq(np.tile(a.compute(), reps), da.tile(a, reps)) + assert_eq(np.tile(b, reps), da.tile(b, reps)) + + @pytest.mark.parametrize("shape, chunks", [((10,), (1,)), ((10, 11, 13), (4, 5, 3))]) -@pytest.mark.parametrize("reps", [0, 1, 2, 3, 5]) -def test_tile(shape, chunks, reps): +@pytest.mark.parametrize("reps", [0, 1, 2, 3, 
5, (1,), (1, 2)]) +def test_tile_chunks(shape, chunks, reps): x = np.random.random(shape) d = da.from_array(x, chunks=chunks) @@ -591,13 +600,32 @@ @pytest.mark.parametrize("shape, chunks", [((10,), (1,)), ((10, 11, 13), (4, 5, 3))]) -@pytest.mark.parametrize("reps", [[1], [1, 2]]) -def test_tile_array_reps(shape, chunks, reps): +@pytest.mark.parametrize("reps", [0, (0,), (2, 0), (0, 3, 0, 4)]) +def test_tile_zero_reps(shape, chunks, reps): x = np.random.random(shape) d = da.from_array(x, chunks=chunks) - with pytest.raises(NotImplementedError): - da.tile(d, reps) + assert_eq(np.tile(x, reps), da.tile(d, reps)) + + +@pytest.mark.parametrize("shape, chunks", [((1, 1, 0), (1, 1, 0)), ((2, 0), (1, 0))]) +@pytest.mark.parametrize("reps", [2, (3, 2, 5)]) +def test_tile_empty_array(shape, chunks, reps): + x = np.empty(shape) + d = da.from_array(x, chunks=chunks) + + assert_eq(np.tile(x, reps), da.tile(d, reps)) + + +@pytest.mark.parametrize( + "shape", [(3,), (2, 3), (3, 4, 3), (3, 2, 3), (4, 3, 2, 4), (2, 2)] +) +@pytest.mark.parametrize("reps", [(2,), (1, 2), (2, 1), (2, 2), (2, 3, 2), (3, 2)]) +def test_tile_np_kroncompare_examples(shape, reps): + x = np.random.random(shape) + d = da.asarray(x) + + assert_eq(np.tile(x, reps), da.tile(d, reps)) skip_stat_length = pytest.mark.xfail(_numpy_117, reason="numpy-14061") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/tests/test_optimization.py new/dask-2.8.0/dask/array/tests/test_optimization.py --- old/dask-2.7.0/dask/array/tests/test_optimization.py 2019-11-07 03:43:11.000000000 +0100 +++ new/dask-2.8.0/dask/array/tests/test_optimization.py 2019-11-13 18:07:07.000000000 +0100 @@ -389,3 +389,13 @@ X = da.dot(X, X.T) assert_eq(X.compute(optimize_graph=False), X) + + +def test_fuse_roots(): + x = da.ones(10, chunks=(2,)) + y = da.zeros(10, chunks=(2,)) + z = (x + 1) + (2 * y ** 2) + (zz,) = dask.optimize(z) + # assert len(zz.dask) == 5 + assert sum(map(dask.istask, zz.dask.values())) == 5 # there are some aliases + assert_eq(zz, z) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/array/tests/test_reductions.py new/dask-2.8.0/dask/array/tests/test_reductions.py --- old/dask-2.7.0/dask/array/tests/test_reductions.py 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/dask/array/tests/test_reductions.py 2019-11-13 18:07:07.000000000 +0100 @@ -663,3 +663,15 @@ _assert(a, b, 0, 1, 2, float) _assert(a, b, offset=1, axis1=0, axis2=2, dtype=int) _assert(a, b, offset=1, axis1=0, axis2=2, dtype=float) + + +@pytest.mark.parametrize("axis", [0, [0, 1], 1, -1]) +@pytest.mark.parametrize("keepdims", [True, False]) +def test_median(axis, keepdims): + x = np.arange(100).reshape((2, 5, 10)) + d = da.from_array(x, chunks=2) + + assert_eq( + da.median(d, axis=axis, keepdims=keepdims), + np.median(x, axis=axis, keepdims=keepdims), + ) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/bag/core.py new/dask-2.8.0/dask/bag/core.py --- old/dask-2.7.0/dask/bag/core.py 2019-11-08 20:58:43.000000000 +0100 +++ new/dask-2.8.0/dask/bag/core.py 2019-11-12 16:54:02.000000000 +0100 @@ -81,6 +81,7 @@ ensure_dict, ensure_bytes, ensure_unicode, + key_split, ) from . import chunk @@ -492,8 +493,7 @@ return type(self), (self.name, self.npartitions) def __str__(self): - name = self.name if len(self.name) < 10 else self.name[:7] + "..." 
- return "dask.bag<%s, npartitions=%d>" % (name, self.npartitions) + return "dask.bag<%s, npartitions=%d>" % (key_split(self.name), self.npartitions) __repr__ = __str__ @@ -1543,10 +1543,10 @@ >>> df = b.to_dataframe() >>> df.compute() - balance name - 0 100 Alice - 1 200 Bob - 0 300 Charlie + name balance + 0 Alice 100 + 1 Bob 200 + 0 Charlie 300 """ import pandas as pd import dask.dataframe as dd diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/bag/tests/test_bag.py new/dask-2.8.0/dask/bag/tests/test_bag.py --- old/dask-2.7.0/dask/bag/tests/test_bag.py 2019-11-08 20:58:43.000000000 +0100 +++ new/dask-2.8.0/dask/bag/tests/test_bag.py 2019-11-12 16:54:02.000000000 +0100 @@ -180,6 +180,8 @@ assert str(b.npartitions) in func(b) assert b.name[:5] in func(b) + assert "from_sequence" in func(db.from_sequence(range(5))) + def test_pluck(): d = {("x", 0): [(1, 10), (2, 20)], ("x", 1): [(3, 30), (4, 40)]} diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/blockwise.py new/dask-2.8.0/dask/blockwise.py --- old/dask-2.7.0/dask/blockwise.py 2019-11-07 03:43:11.000000000 +0100 +++ new/dask-2.8.0/dask/blockwise.py 2019-11-13 18:07:07.000000000 +0100 @@ -12,7 +12,7 @@ from .core import reverse_dict from .delayed import to_task_dask from .highlevelgraph import HighLevelGraph -from .optimization import SubgraphCallable +from .optimization import SubgraphCallable, fuse from .utils import ensure_dict, homogeneous_deepmap, apply @@ -775,3 +775,59 @@ raise ValueError("Shapes do not align %s" % g) return toolz.valmap(toolz.first, g2) + + +def fuse_roots(graph: HighLevelGraph, keys: list): + """ + Fuse nearby layers if they don't have dependencies + + Often Blockwise sections of the graph fill out all of the computation + except for the initial data access or data loading layers:: + + Large Blockwise Layer + | | | + X Y Z + + This can be troublesome because X, Y, and Z tasks may be executed on + different machines, and then require communication to move around. + + This optimization identifies this situation, lowers all of the graphs to + concrete dicts, and then calls ``fuse`` on them, with a width equal to the + number of layers like X, Y, and Z. + + This is currently used within array and dataframe optimizations. 
+ + Parameters + ---------- + graph: HighLevelGraph + The full graph of the computation + keys: list + The output keys of the computation, to be passed on to fuse + + See Also + -------- + Blockwise + fuse + """ + layers = graph.layers.copy() + dependencies = graph.dependencies.copy() + dependents = reverse_dict(dependencies) + + for name, layer in graph.layers.items(): + deps = graph.dependencies[name] + if ( + isinstance(layer, Blockwise) + and len(deps) > 1 + and not any(dependencies[dep] for dep in deps) # no need to fuse if 0 or 1 + and all(len(dependents[dep]) == 1 for dep in deps) + ): + new = toolz.merge(layer, *[layers[dep] for dep in deps]) + new, _ = fuse(new, keys, ave_width=len(deps)) + + for dep in deps: + del layers[dep] + + layers[name] = new + dependencies[name] = set() + + return HighLevelGraph(layers, dependencies) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/dataframe/backends.py new/dask-2.8.0/dask/dataframe/backends.py --- old/dask-2.7.0/dask/dataframe/backends.py 2019-11-08 20:58:43.000000000 +0100 +++ new/dask-2.8.0/dask/dataframe/backends.py 2019-11-14 20:24:20.000000000 +0100 @@ -15,4 +15,4 @@ @meta_nonempty.register_lazy("cudf") @make_meta.register_lazy("cudf") def _register_cudf(): - import cudf # noqa: F401 + import dask_cudf # noqa: F401 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/dataframe/core.py new/dask-2.8.0/dask/dataframe/core.py --- old/dask-2.7.0/dask/dataframe/core.py 2019-11-05 22:48:30.000000000 +0100 +++ new/dask-2.8.0/dask/dataframe/core.py 2019-11-13 18:07:07.000000000 +0100 @@ -717,7 +717,7 @@ 2017-01-08 13.0 2017-01-09 15.0 2017-01-10 17.0 - dtype: float64 + Freq: D, dtype: float64 """ from .rolling import map_overlap @@ -2079,7 +2079,7 @@ ) layer[(name, i)] = (aggregate, (cumpart._name, i), (cname, i)) graph = HighLevelGraph.from_collections( - cname, layer, dependencies=[cumpart, cumlast] + name, layer, dependencies=[cumpart, cumlast] ) result = new_dd_object(graph, name, chunk(self._meta), self.divisions) return handle_out(out, result) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/dataframe/groupby.py new/dask-2.8.0/dask/dataframe/groupby.py --- old/dask-2.7.0/dask/dataframe/groupby.py 2019-11-05 22:48:30.000000000 +0100 +++ new/dask-2.8.0/dask/dataframe/groupby.py 2019-11-13 18:07:11.000000000 +0100 @@ -162,22 +162,25 @@ return df.groupby(**kwargs) -def _groupby_slice_apply(df, grouper, key, func, *args, **kwargs): +def _groupby_slice_apply( + df, grouper, key, func, *args, group_keys=True, dropna=None, **kwargs +): # No need to use raise if unaligned here - this is only called after # shuffling, which makes everything aligned already - group_keys = kwargs.pop("group_keys", True) - g = df.groupby(grouper, group_keys=group_keys) + dropna = {"dropna": dropna} if dropna is not None else {} + g = df.groupby(grouper, group_keys=group_keys, **dropna) if key: g = g[key] return g.apply(func, *args, **kwargs) -def _groupby_slice_transform(df, grouper, key, func, *args, **kwargs): +def _groupby_slice_transform( + df, grouper, key, func, *args, group_keys=True, dropna=None, **kwargs +): # No need to use raise if unaligned here - this is only called after # shuffling, which makes everything aligned already - group_keys = kwargs.pop("group_keys", True) - - g = df.groupby(grouper, group_keys=group_keys) + dropna = {"dropna": dropna} if dropna is not None else {} + g
= df.groupby(grouper, group_keys=group_keys, **dropna) if key: g = g[key] @@ -271,15 +274,16 @@ self.__name__ = name -def _groupby_aggregate(df, aggfunc=None, levels=None, **kwargs): - return aggfunc(df.groupby(level=levels, sort=False), **kwargs) +def _groupby_aggregate(df, aggfunc=None, levels=None, dropna=None, **kwargs): + dropna = {"dropna": dropna} if dropna is not None else {} + return aggfunc(df.groupby(level=levels, sort=False, **dropna), **kwargs) -def _apply_chunk(df, *index, **kwargs): +def _apply_chunk(df, *index, dropna=None, **kwargs): func = kwargs.pop("chunk") columns = kwargs.pop("columns") - - g = _groupby_raise_unaligned(df, by=index) + dropna = {"dropna": dropna} if dropna is not None else {} + g = _groupby_raise_unaligned(df, by=index, **dropna) if is_series_like(df) or columns is None: return func(g, **kwargs) @@ -980,9 +984,11 @@ The slice keys applied to GroupBy result group_keys: bool Passed to pandas.DataFrame.groupby() + dropna: bool + Whether to drop null values from groupby index """ - def __init__(self, df, by=None, slice=None, group_keys=True): + def __init__(self, df, by=None, slice=None, group_keys=True, dropna=None): assert isinstance(df, (DataFrame, Series)) self.group_keys = group_keys @@ -1020,7 +1026,13 @@ else: index_meta = self.index - self._meta = self.obj._meta.groupby(index_meta, group_keys=group_keys) + self.dropna = {} + if dropna is not None: + self.dropna["dropna"] = dropna + + self._meta = self.obj._meta.groupby( + index_meta, group_keys=group_keys, **self.dropna + ) @property def _meta_nonempty(self): @@ -1041,7 +1053,7 @@ else: index_meta = self.index - grouped = sample.groupby(index_meta, group_keys=self.group_keys) + grouped = sample.groupby(index_meta, group_keys=self.group_keys, **self.dropna) return _maybe_slice(grouped, self._slice) def _aca_agg( @@ -1068,12 +1080,16 @@ if not isinstance(self.index, list) else [self.obj] + self.index, chunk=_apply_chunk, - chunk_kwargs=dict(chunk=func, columns=columns, **chunk_kwargs), + chunk_kwargs=dict( + chunk=func, columns=columns, **chunk_kwargs, **self.dropna + ), aggregate=_groupby_aggregate, meta=meta, token=token, split_every=split_every, - aggregate_kwargs=dict(aggfunc=aggfunc, levels=levels, **aggregate_kwargs), + aggregate_kwargs=dict( + aggfunc=aggfunc, levels=levels, **aggregate_kwargs, **self.dropna + ), split_out=split_out, split_out_setup=split_out_on_index, ) @@ -1097,7 +1113,8 @@ chunk=chunk, columns=columns, token=name_part, - meta=meta + meta=meta, + **self.dropna ) cumpart_raw_frame = ( @@ -1125,7 +1142,8 @@ columns=0 if columns is None else columns, chunk=M.last, meta=meta, - token=name_last + token=name_last, + **self.dropna ) # aggregate cumulated partitions and its previous last element @@ -1585,6 +1603,7 @@ token=funcname(func), *args, group_keys=self.group_keys, + **self.dropna, **kwargs ) @@ -1672,6 +1691,7 @@ token=funcname(func), *args, group_keys=self.group_keys, + **self.dropna, **kwargs ) @@ -1684,9 +1704,9 @@ def __getitem__(self, key): if isinstance(key, list): - g = DataFrameGroupBy(self.obj, by=self.index, slice=key) + g = DataFrameGroupBy(self.obj, by=self.index, slice=key, **self.dropna) else: - g = SeriesGroupBy(self.obj, by=self.index, slice=key) + g = SeriesGroupBy(self.obj, by=self.index, slice=key, **self.dropna) # error is raised from pandas g._meta = g._meta[key] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/dataframe/optimize.py new/dask-2.8.0/dask/dataframe/optimize.py --- 
old/dask-2.7.0/dask/dataframe/optimize.py 2019-11-07 03:43:11.000000000 +0100 +++ new/dask-2.8.0/dask/dataframe/optimize.py 2019-11-13 18:07:07.000000000 +0100 @@ -5,7 +5,7 @@ from .. import config, core from ..highlevelgraph import HighLevelGraph from ..utils import ensure_dict -from ..blockwise import optimize_blockwise, Blockwise +from ..blockwise import optimize_blockwise, fuse_roots, Blockwise def optimize(dsk, keys, **kwargs): @@ -14,6 +14,7 @@ # Think about an API for this. dsk = optimize_read_parquet_getitem(dsk) dsk = optimize_blockwise(dsk, keys=list(core.flatten(keys))) + dsk = fuse_roots(dsk, keys=list(core.flatten(keys))) dsk = ensure_dict(dsk) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/dataframe/tests/test_groupby.py new/dask-2.8.0/dask/dataframe/tests/test_groupby.py --- old/dask-2.7.0/dask/dataframe/tests/test_groupby.py 2019-11-05 22:48:30.000000000 +0100 +++ new/dask-2.8.0/dask/dataframe/tests/test_groupby.py 2019-11-13 18:07:11.000000000 +0100 @@ -2187,3 +2187,60 @@ lambda series: series - series.mean() ), ) + + +@pytest.mark.xfail(reason="dropna kwarg not supported in pandas groupby.") +@pytest.mark.parametrize("dropna", [False, True]) +def test_groupby_dropna_pandas(dropna): + + # The `dropna` arg is not currently supported by pandas + # (See https://github.com/pandas-dev/pandas/pull/21669) + # Dask supports the argument for the cudf backend, + # but passing it to the pandas backend will fail. + + # TODO: Expand test when `dropna` is supported in pandas. + # (See: `test_groupby_dropna_cudf`) + + df = pd.DataFrame( + {"a": [1, 2, 3, 4, None, None, 7, 8], "e": [4, 5, 6, 3, 2, 1, 0, 0]} + ) + ddf = dd.from_pandas(df, npartitions=3) + + dask_result = ddf.groupby("a", dropna=dropna) + pd_result = df.groupby("a", dropna=dropna) + assert_eq(dask_result, pd_result) + + +@pytest.mark.parametrize("dropna", [False, True, None]) +@pytest.mark.parametrize("by", ["a", "c", "d", ["a", "b"], ["a", "c"], ["a", "d"]]) +def test_groupby_dropna_cudf(dropna, by): + + # NOTE: This test requires cudf/dask_cudf, and will + # be skipped by non-GPU CI + + cudf = pytest.importorskip("cudf") + dask_cudf = pytest.importorskip("dask_cudf") + + df = cudf.DataFrame( + { + "a": [1, 2, 3, 4, None, None, 7, 8], + "b": [1, 0] * 4, + "c": ["a", "b", None, None, "e", "f", "g", "h"], + "e": [4, 5, 6, 3, 2, 1, 0, 0], + } + ) + df["d"] = df["c"].astype("category") + ddf = dask_cudf.from_cudf(df, npartitions=3) + + if dropna is None: + dask_result = ddf.groupby(by).e.sum() + cudf_result = df.groupby(by).e.sum() + else: + dask_result = ddf.groupby(by, dropna=dropna).e.sum() + cudf_result = df.groupby(by, dropna=dropna).e.sum() + if by in ["c", "d"]: + # Lose string/category index name in cudf... + dask_result = dask_result.compute() + dask_result.index.name = cudf_result.index.name + + assert_eq(dask_result, cudf_result) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/diagnostics/profile_visualize.py new/dask-2.8.0/dask/diagnostics/profile_visualize.py --- old/dask-2.7.0/dask/diagnostics/profile_visualize.py 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/dask/diagnostics/profile_visualize.py 2019-11-12 16:54:02.000000000 +0100 @@ -341,6 +341,7 @@ The completed bokeh plot object.
""" bp = import_required("bokeh.plotting", _BOKEH_MISSING_MSG) + import bokeh from bokeh import palettes from bokeh.models import LinearAxis, Range1d @@ -365,7 +366,17 @@ t = mem = cpu = [] p = bp.figure(y_range=(0, 100), x_range=(0, 1), **defaults) colors = palettes.all_palettes[palette][6] - p.line(t, cpu, color=colors[0], line_width=4, legend="% CPU") + p.line( + t, + cpu, + color=colors[0], + line_width=4, + **{ + "legend_label" + if LooseVersion(bokeh.__version__) >= "1.4" + else "legend": "% CPU" + } + ) p.yaxis.axis_label = "% CPU" p.extra_y_ranges = { "memory": Range1d( @@ -373,7 +384,16 @@ ) } p.line( - t, mem, color=colors[2], y_range_name="memory", line_width=4, legend="Memory" + t, + mem, + color=colors[2], + y_range_name="memory", + line_width=4, + **{ + "legend_label" + if LooseVersion(bokeh.__version__) >= "1.4" + else "legend": "Memory" + } ) p.add_layout(LinearAxis(y_range_name="memory", axis_label="Memory (MB)"), "right") p.xaxis.axis_label = "Time (s)" diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/diagnostics/tests/test_profiler.py new/dask-2.8.0/dask/diagnostics/tests/test_profiler.py --- old/dask-2.7.0/dask/diagnostics/tests/test_profiler.py 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/dask/diagnostics/tests/test_profiler.py 2019-11-12 16:54:02.000000000 +0100 @@ -362,13 +362,12 @@ @ignore_abc_warning def test_get_colors(): from dask.diagnostics.profile_visualize import get_colors - from bokeh.palettes import Blues9, Blues5, Viridis - from itertools import cycle + from bokeh.palettes import Blues256, Blues5, Viridis funcs = list(range(11)) cmap = get_colors("Blues", funcs) - lk = dict(zip(funcs, cycle(Blues9))) - assert cmap == [lk[i] for i in funcs] + assert set(cmap) < set(Blues256) + assert len(set(cmap)) == 11 funcs = list(range(5)) cmap = get_colors("Blues", funcs) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/optimization.py new/dask-2.8.0/dask/optimization.py --- old/dask-2.7.0/dask/optimization.py 2019-11-07 03:43:11.000000000 +0100 +++ new/dask-2.8.0/dask/optimization.py 2019-11-13 18:07:07.000000000 +0100 @@ -1,4 +1,5 @@ import math +import numbers import re from . 
import config, core @@ -516,6 +517,11 @@ reducible = {k for k, vals in rdeps.items() if len(vals) == 1} if keys: reducible -= keys + + for k, v in dsk.items(): + if type(v) is not tuple and not isinstance(v, (numbers.Number, str)): + reducible.discard(k) + if not reducible and ( not fuse_subgraphs or all(len(set(v)) != 1 for v in rdeps.values()) ): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/sizeof.py new/dask-2.8.0/dask/sizeof.py --- old/dask-2.7.0/dask/sizeof.py 2019-11-05 22:48:30.000000000 +0100 +++ new/dask-2.8.0/dask/sizeof.py 2019-11-13 18:07:11.000000000 +0100 @@ -28,6 +28,14 @@ return getsizeof(seq) + sum(map(sizeof, seq)) +@sizeof.register(dict) +def sizeof_python_dict(d): + if len(d) > 10: + return getsizeof(d) + 1000 * len(d) + else: + return getsizeof(d) + sum(map(sizeof, d.keys())) + sum(map(sizeof, d.values())) + + @sizeof.register_lazy("cupy") def register_cupy(): import cupy diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/tests/test_base.py new/dask-2.8.0/dask/tests/test_base.py --- old/dask-2.7.0/dask/tests/test_base.py 2019-11-08 20:58:43.000000000 +0100 +++ new/dask-2.8.0/dask/tests/test_base.py 2019-11-13 18:07:07.000000000 +0100 @@ -582,9 +582,7 @@ # Otherwise, the lengths below would be 4 and 0. assert len([k for k in keys if "mul" in k[0]]) == 8 assert len([k for k in keys if "add" in k[0]]) == 4 - assert ( - len([k for k in keys if "add-from_sequence-mul" in k[0]]) == 4 - ) # See? Renamed + assert len([k for k in keys if "add-mul" in k[0]]) == 4 # See? Renamed @pytest.mark.skipif("not da") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/tests/test_optimization.py new/dask-2.8.0/dask/tests/test_optimization.py --- old/dask-2.7.0/dask/tests/test_optimization.py 2019-11-07 03:43:11.000000000 +0100 +++ new/dask-2.8.0/dask/tests/test_optimization.py 2019-11-13 18:07:07.000000000 +0100 @@ -1285,3 +1285,15 @@ } ) assert res == sol + + +def test_dont_fuse_numpy_arrays(): + """ + Some types should stay in the graph bare + + This helps with things like serialization + """ + np = pytest.importorskip("numpy") + dsk = {"x": np.arange(5), "y": (inc, "x")} + + assert fuse(dsk, "y")[0] == dsk diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask/tests/test_sizeof.py new/dask-2.8.0/dask/tests/test_sizeof.py --- old/dask-2.7.0/dask/tests/test_sizeof.py 2019-11-05 22:48:30.000000000 +0100 +++ new/dask-2.8.0/dask/tests/test_sizeof.py 2019-11-13 18:07:11.000000000 +0100 @@ -121,3 +121,11 @@ assert sizeof(empty.columns[0]) > 0 assert sizeof(empty.columns[1]) > 0 assert sizeof(empty.columns[2]) > 0 + + +def test_dict(): + np = pytest.importorskip("numpy") + x = np.ones(10000) + assert sizeof({"x": x}) > x.nbytes + assert sizeof({"x": [x]}) > x.nbytes + assert sizeof({"x": [{"y": x}]}) > x.nbytes diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask.egg-info/PKG-INFO new/dask-2.8.0/dask.egg-info/PKG-INFO --- old/dask-2.7.0/dask.egg-info/PKG-INFO 2019-11-08 22:06:23.000000000 +0100 +++ new/dask-2.8.0/dask.egg-info/PKG-INFO 2019-11-14 23:57:18.000000000 +0100 @@ -1,6 +1,6 @@ Metadata-Version: 2.1 Name: dask -Version: 2.7.0 +Version: 2.8.0 Summary: Parallel PyData with Task Scheduling Home-page: https://github.com/dask/dask/ Maintainer: Matthew Rocklin @@ -43,10 +43,10 @@ Classifier: Programming 
Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Requires-Python: >=3.6 -Provides-Extra: dataframe -Provides-Extra: delayed +Provides-Extra: complete Provides-Extra: array -Provides-Extra: distributed Provides-Extra: diagnostics +Provides-Extra: dataframe Provides-Extra: bag -Provides-Extra: complete +Provides-Extra: delayed +Provides-Extra: distributed diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/dask.egg-info/requires.txt new/dask-2.8.0/dask.egg-info/requires.txt --- old/dask-2.7.0/dask.egg-info/requires.txt 2019-11-08 22:06:23.000000000 +0100 +++ new/dask-2.8.0/dask.egg-info/requires.txt 2019-11-14 23:57:18.000000000 +0100 @@ -5,7 +5,7 @@ [bag] cloudpickle>=0.2.1 -fsspec>=0.5.1 +fsspec>=0.6.0 toolz>=0.7.3 partd>=0.3.10 @@ -14,7 +14,7 @@ bokeh>=1.0.0 cloudpickle>=0.2.1 distributed>=2.0 -fsspec>=0.5.1 +fsspec>=0.6.0 numpy>=1.13.0 pandas>=0.21.0 partd>=0.3.10 @@ -25,7 +25,7 @@ pandas>=0.21.0 toolz>=0.7.3 partd>=0.3.10 -fsspec>=0.5.1 +fsspec>=0.6.0 [delayed] cloudpickle>=0.2.1 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/docs/source/best-practices.rst new/dask-2.8.0/docs/source/best-practices.rst --- old/dask-2.7.0/docs/source/best-practices.rst 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/docs/source/best-practices.rst 2019-11-13 18:07:11.000000000 +0100 @@ -334,3 +334,28 @@ for item in L: result = process(item, df) # include pointer to df in every delayed call results.append(result) + + +Avoid calling compute repeatedly +-------------------------------- + +Compute related results with shared computations in a single :func:`dask.compute` call + +.. code-block:: python + + # Don't repeatedly call compute + + df = dd.read_csv("...") + xmin = df.x.min().compute() + xmax = df.x.max().compute() + +.. code-block:: python + + # Do compute multiple results at the same time + + df = dd.read_csv("...") + + xmin, xmax = dask.compute(df.x.min(), df.x.max()) + +This allows Dask to compute the shared parts of the computation (like the +``dd.read_csv`` call above) only once, rather than once per ``compute`` call. 
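The best-practices addition above is easiest to see with a runnable sketch. The following is a minimal illustration of the same point, not code from the package: the delayed load() function and its one-second sleep are stand-ins for an expensive shared step such as the dd.read_csv call in the documentation.

    import time
    import dask
    from dask import delayed

    @delayed
    def load():
        # stand-in for an expensive shared step, e.g. reading a large CSV
        time.sleep(1)
        return list(range(100))

    @delayed
    def minimum(xs):
        return min(xs)

    @delayed
    def maximum(xs):
        return max(xs)

    data = load()  # a single shared node in the task graph

    # One call: load() runs once, so this takes about one second.
    lo, hi = dask.compute(minimum(data), maximum(data))

    # Two calls: load() runs once per compute(), about two seconds total.
    lo = minimum(data).compute()
    hi = maximum(data).compute()

Both reductions hang off the same load() node, so batching them into a single dask.compute call is what lets the scheduler execute that node only once.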
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/docs/source/changelog.rst new/dask-2.8.0/docs/source/changelog.rst --- old/dask-2.7.0/docs/source/changelog.rst 2019-11-08 21:16:03.000000000 +0100 +++ new/dask-2.8.0/docs/source/changelog.rst 2019-11-14 23:55:24.000000000 +0100 @@ -1,6 +1,43 @@ Changelog ========= +2.8.0 / 2019-11-14 +------------------ + +Array ++++++ +- Implement complete dask.array.tile function (:pr:`5574`) `Bouwe Andela`_ +- Add median along an axis with automatic rechunking (:pr:`5575`) `Matthew Rocklin`_ +- Allow da.asarray to chunk inputs (:pr:`5586`) `Matthew Rocklin`_ + +Bag ++++ + +- Use key_split in Bag name (:pr:`5571`) `Matthew Rocklin`_ + +Core +++++ +- Switch Doctests to Py3.7 (:pr:`5573`) `Ryan Nazareth`_ +- Relax get_colors test to adapt to new Bokeh release (:pr:`5576`) `Matthew Rocklin`_ +- Add dask.blockwise.fuse_roots optimization (:pr:`5451`) `Matthew Rocklin`_ +- Add sizeof implementation for small dicts (:pr:`5578`) `Matthew Rocklin`_ +- Update fsspec, gcsfs, s3fs (:pr:`5588`) `Tom Augspurger`_ + +DataFrame ++++++++++ +- Add dropna argument to groupby (:pr:`5579`) `Richard J Zamora`_ +- Revert "Remove import of dask_cudf, which is now a part of cudf (:pr:`5568`)" (:pr:`5590`) `Matthew Rocklin`_ + +Documentation ++++++++++++++ + +- Add best practice for dask.compute function (:pr:`5583`) `Matthew Rocklin`_ +- Create FUNDING.yml (:pr:`5587`) `Gina Helfrich`_ +- Add screencast for coordination primitives (:pr:`5593`) `Matthew Rocklin`_ +- Move funding to .github repo (:pr:`5589`) `Tom Augspurger`_ +- Update calendar link (:pr:`5569`) `Tom Augspurger`_ + + 2.7.0 / 2019-11-08 ------------------ @@ -37,8 +74,8 @@ - Explicitly use iloc for row indexing (:pr:`5500`) `Krishan Bhasin`_ - Accept dask arrays on columns assignment (:pr:`5224`) `Henrique Ribeiro`_ - Implement unique and value_counts for SeriesGroupBy (:pr:`5358`) `Scott Sievert`_ -- Add sizeof definition for pyarrow tables and columns (:pr:`5522`) `Richard J Zamora`_ -- Enable row-group task partitioning in pyarrow-based read_parquet (:pr:`5508`) `Richard J Zamora`_ +- Add sizeof definition for pyarrow tables and columns (:pr:`5522`) `Richard J Zamora`_ +- Enable row-group task partitioning in pyarrow-based read_parquet (:pr:`5508`) `Richard J Zamora`_ - Removes npartitions='auto' from dd.merge docstring (:pr:`5531`) `James Bourbeau`_ - Apply enforce error message shows non-overlapping columns. (:pr:`5530`) `Tom Augspurger`_ - Optimize meta_nonempty for repetitive dtypes (:pr:`5553`) `Petio Petrov`_ @@ -2662,3 +2699,4 @@ .. _`Mads R. B. Kristensen`: https://github.com/madsbk .. _`Prithvi MK`: https://github.com/pmk21 .. _`Eric Dill`: https://github.com/ericdill +.. _`Gina Helfrich`: https://github.com/Dr-G diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/docs/source/futures.rst new/dask-2.8.0/docs/source/futures.rst --- old/dask-2.7.0/docs/source/futures.rst 2019-10-11 05:14:07.000000000 +0200 +++ new/dask-2.8.0/docs/source/futures.rst 2019-11-14 20:24:20.000000000 +0100 @@ -446,6 +446,16 @@ resources, track progress of ongoing computations, or share data in side-channels between many workers, clients, and tasks sensibly. +..
raw:: html + + <iframe width="560" + height="315" + src="https://www.youtube.com/embed/Q-Y3BR1u7c0" + style="margin: 0 auto 20px auto; display: block;" + frameborder="0" + allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" + allowfullscreen></iframe> + These features are rarely necessary for common use of Dask. We recommend that beginning users stick with using the simpler futures found above (like ``Client.submit`` and ``Client.gather``) rather than embracing needlessly diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/docs/source/install.rst new/dask-2.8.0/docs/source/install.rst --- old/dask-2.7.0/docs/source/install.rst 2019-11-05 22:48:30.000000000 +0100 +++ new/dask-2.8.0/docs/source/install.rst 2019-11-13 21:17:45.000000000 +0100 @@ -96,9 +96,9 @@ +-------------+----------+--------------------------------------------------------------+ | fastparquet | | Storing and reading data from parquet files | +-------------+----------+--------------------------------------------------------------+ -| fsspec | >=0.5.1 | Used for local, cluster and remote data IO | +| fsspec | >=0.6.0 | Used for local, cluster and remote data IO | +-------------+----------+--------------------------------------------------------------+ -| gcsfs | | File-system interface to Google Cloud Storage | +| gcsfs | >=0.4.0 | File-system interface to Google Cloud Storage | +-------------+----------+--------------------------------------------------------------+ | murmurhash | | Faster hashing of arrays | +-------------+----------+--------------------------------------------------------------+ @@ -112,7 +112,7 @@ +-------------+----------+--------------------------------------------------------------+ | pyarrow | >=0.14.0 | Python library for Apache Arrow | +-------------+----------+--------------------------------------------------------------+ -| s3fs | | Reading from Amazon S3 | +| s3fs | >=0.4.0 | Reading from Amazon S3 | +-------------+----------+--------------------------------------------------------------+ | sqlalchemy | | Writing and reading from SQL databases | +-------------+----------+--------------------------------------------------------------+ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/docs/source/support.rst new/dask-2.8.0/docs/source/support.rst --- old/dask-2.7.0/docs/source/support.rst 2019-11-05 22:48:30.000000000 +0100 +++ new/dask-2.8.0/docs/source/support.rst 2019-11-14 20:24:20.000000000 +0100 @@ -23,13 +23,16 @@ Overflow or GitHub. 4. **Monthly developer meeting** happens the first Thursday of the month at 11:00 US Central Time in `this video meeting <https://zoom.us/j/802251830>`_. - Subscribe to `this Google Calendar invite`_ to be notified of changes to - the meeting schedule. Meeting notes are available at + Meeting notes are available at https://docs.google.com/document/d/1UqNAP87a56ERH_xkQsS5Q_0PKYybd5Lj2WANy_hRzI0/edit + You can subscribe to this calendar to be notified of changes: + + * `Google Calendar <https://calendar.google.com/calendar/embed?src=4l0vts0c1cgdbq5jhcogj55sfs%40group.calendar.google.com&ctz=America%2FChicago>`__ + * `iCal <https://calendar.google.com/calendar/ical/4l0vts0c1cgdbq5jhcogj55sfs%40group.calendar.google.com/public/basic.ics>`__ + .. _`Stack Overflow with the #dask tag`: https://stackoverflow.com/questions/tagged/dask .. _`GitHub issue tracker`: https://github.com/dask/dask/issues/ -.. 
_`this Google Calendar invite`: https://calendar.google.com/event?action=TEMPLATE&tmeid=NmxnamVvcGtjY3E2NGI5bTZzcW1hYjlrYzhybTZiYjFjY29qOGI5ZzY0cWoyYzFrNjFpMzhwaGlja18yMDE5MDYwNlQxNjAwMDBaIDRsMHZ0czBjMWNnZGJxNWpoY29najU1c2ZzQGc&tmsrc=4l0vts0c1cgdbq5jhcogj55sfs%40group.calendar.google.com&scp=ALL Asking for help diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/dask-2.7.0/setup.py new/dask-2.8.0/setup.py --- old/dask-2.7.0/setup.py 2019-11-08 20:58:43.000000000 +0100 +++ new/dask-2.8.0/setup.py 2019-11-14 23:56:25.000000000 +0100 @@ -11,7 +11,7 @@ "array": ["numpy >= 1.13.0", "toolz >= 0.7.3"], "bag": [ "cloudpickle >= 0.2.1", - "fsspec >= 0.5.1", + "fsspec >= 0.6.0", "toolz >= 0.7.3", "partd >= 0.3.10" ], @@ -20,7 +20,7 @@ "pandas >= 0.21.0", "toolz >= 0.7.3", "partd >= 0.3.10", - "fsspec >= 0.5.1", + "fsspec >= 0.6.0", ], "distributed": ["distributed >= 2.0"], "diagnostics": ["bokeh >= 1.0.0"],
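For packagers who want a quick smoke test of the user-visible 2.8.0 array features beyond the test suite run in %check, the changes can be exercised in a few lines. This is an illustrative sketch written against the behavior shown in the test diffs above (test_tile_chunks, test_median, and test_asarray_chunks), not a script shipped with the package:

    import numpy as np
    import dask
    import dask.array as da

    x = np.arange(100).reshape(2, 5, 10)
    d = da.from_array(x, chunks=2)

    # dask.array.tile now accepts tuple-valued reps, not just scalar integers
    assert da.tile(d, (2, 1, 3)).shape == np.tile(x, (2, 1, 3)).shape

    # da.median works along an axis, rechunking that axis to one chunk internally
    assert np.allclose(da.median(d, axis=1).compute(), np.median(x, axis=1))

    # da.asarray now chunks its input according to the array.chunk-size config
    with dask.config.set({"array.chunk-size": "100 B"}):
        assert da.asarray(np.ones(1000)).npartitions > 1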