[GitHub] spark issue #21157: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...

superbobry Fri, 12 Oct 2018 13:29:06 -0700

Github user superbobry commented on the issue:

    https://github.com/apache/spark/pull/21157
  
    Nope, the job I was referring to is not open source, but I guess the speedup is easy to justify: a much smaller payload and faster deserialization:
    
    ```
    >>> from collections import namedtuple
    >>> Stats = namedtuple("Stats", ["sample_mean", "sample_variance"])
    >>> import pickle
    >>> len(pickle.dumps(Stats(42, 42)))
    31
    >>> len(pickle.dumps(("Stats", Stats._fields, (42, 42))))
    68
    ```
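
A minimal sketch for reproducing both halves of that claim (payload size and deserialization time) on your own interpreter. The `rebuild` helper and its `_class_cache` are hypothetical stand-ins for "recreate the namedtuple from its name, fields and values"; they are not Spark's actual code. Sizes and timings depend on the Python version and pickle protocol, so no particular numbers are implied.

```python
import pickle
import timeit
from collections import namedtuple

Stats = namedtuple("Stats", ["sample_mean", "sample_variance"])

# Hypothetical cache and helper, for illustration only (not Spark's code).
_class_cache = {}


def rebuild(name, fields, values):
    """Recreate a namedtuple instance from its name, field names and values."""
    key = (name, fields)
    cls = _class_cache.get(key)
    if cls is None:
        cls = _class_cache[key] = namedtuple(name, fields)
    return cls(*values)


record = Stats(42, 42)

# Pin the protocol so both payloads are encoded the same way.
native = pickle.dumps(record, protocol=2)
expanded = pickle.dumps(("Stats", Stats._fields, tuple(record)), protocol=2)

# Payload sizes in bytes for the two encodings.
sizes = {"native": len(native), "expanded": len(expanded)}

# Round-trip both payloads back into namedtuple instances.
native_result = pickle.loads(native)
expanded_result = rebuild(*pickle.loads(expanded))


def time_loads(number=10000):
    """Time `number` deserializations of each payload, in seconds."""
    return {
        "native": timeit.timeit(lambda: pickle.loads(native), number=number),
        "expanded": timeit.timeit(
            lambda: rebuild(*pickle.loads(expanded)), number=number
        ),
    }


if __name__ == "__main__":
    print(sizes)
    print(time_loads())
```

Run it as a script so that `Stats` is importable from `__main__`; pickle stores the class by reference and needs to look it up again on load.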


