Re: Support for numpy matrix
Hi!

I agree. In fact, all of this information is already there. :-)


Mitar

On Sat, Mar 30, 2019 at 8:40 PM Wes McKinney wrote:
>
> hi Mitar,
>
> Let's discuss further on JIRA? It's best to keep all the information
> about the issue in one place.
>
> Thanks

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
Re: Support for numpy matrix
Hi!

I added:

    serialization_context.register_type(
        np.matrix, 'np.matrix',
        custom_serializer=_serialize_numpy_array_list,
        custom_deserializer=_deserialize_numpy_array_list)

But it did not help, probably because np.matrix is a subclass of
np.ndarray anyway. So no change here.

An interesting fact is that this worked with older versions of numpy,
but stopped working with numpy 1.15.2. It works with numpy 1.14.3. So
something changed on the numpy side.


Mitar

On Sat, Mar 30, 2019 at 3:34 PM Philipp Moritz wrote:
>
> Hey Mitar,
>
> It might be as simple as adding a handler here:
> https://github.com/apache/arrow/blob/master/python/pyarrow/serialization.py#L300
>
> Do you want to try that?
>
> -- Philipp.

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
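Since registering a handler for np.matrix did not change anything, one possible workaround on the caller's side is to convert the matrix to a plain ndarray before storing it and to convert it back after loading. The sketch below uses only NumPy; the helper names matrix_to_payload and payload_to_matrix are hypothetical and are not part of pyarrow.

```python
import numpy as np


def matrix_to_payload(value):
    """Return an (is_matrix, array) pair whose array part is a plain ndarray.

    Hypothetical helper: np.asarray() drops the np.matrix subclass, so the
    result is handled like any other ndarray.
    """
    if isinstance(value, np.matrix):
        return True, np.asarray(value)
    return False, value


def payload_to_matrix(payload):
    """Undo matrix_to_payload(), restoring np.matrix where it was used."""
    is_matrix, array = payload
    return np.asmatrix(array) if is_matrix else array
```

The flag travels with the data so that ordinary arrays are returned unchanged. Whether this is acceptable depends on whether callers can wrap every put and get.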
Re: Support for numpy matrix
Hi!

I do not know where to start looking into this. I am not sure I have
enough knowledge about Arrow to be able to make a PR.


Mitar

On Sat, Mar 30, 2019 at 3:17 PM Wes McKinney wrote:
>
> hi Mitar,
>
> I see you reported the issue on October 2 and no one has volunteered
> to fix it yet. Are you up to submit a PR?
>
> Thanks
> Wes

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
Support for numpy matrix
Hi!

It seems numpy's matrix is not supported in recent versions of pyarrow:

https://issues.apache.org/jira/browse/ARROW-3399

Any ideas why this would be happening?


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
Efficient Pandas serialization for mixed object and numeric DataFrames
Hi!

It seems that if a DataFrame contains both numeric and object columns,
the whole DataFrame is pickled, rather than only the object columns.
Is this right? Are there any plans to improve this?


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
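To make the question concrete, here is a minimal sketch of the behaviour being asked for: numeric columns are kept as raw NumPy arrays and only the remaining columns are pickled. It is not how pyarrow works internally. The function names are hypothetical, and the sketch assumes unique column names and plain NumPy numeric dtypes.

```python
import pickle

import numpy as np
import pandas as pd


def split_for_serialization(df):
    """Separate numeric columns (kept as arrays) from the rest (pickled)."""
    numeric = df.select_dtypes(include=[np.number])
    other = df.drop(columns=numeric.columns)
    return {
        "columns": list(df.columns),  # remembers the original column order
        "numeric": {name: numeric[name].to_numpy() for name in numeric.columns},
        "pickled": pickle.dumps(other),  # also carries the index
    }


def join_after_deserialization(payload):
    """Rebuild the DataFrame produced by split_for_serialization()."""
    other = pickle.loads(payload["pickled"])
    numeric = pd.DataFrame(payload["numeric"], index=other.index)
    return pd.concat([numeric, other], axis=1)[payload["columns"]]
```

With a split like this, the numeric arrays could be handed to an array-aware serializer while pickle only sees the object columns.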
[jira] [Created] (ARROW-3399) Cannot serialize numpy matrix object
Mitar created ARROW-3399:

Summary: Cannot serialize numpy matrix object
Key: ARROW-3399
URL: https://issues.apache.org/jira/browse/ARROW-3399
Project: Apache Arrow
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mitar

This is a regression from 0.9.0 and happens with 0.10.0 with Python 3.6.5 on Linux.

{code:python}
from pyarrow import plasma
import numpy
import time
import subprocess
import os
import signal

m = numpy.matrix(numpy.array([[1, 2], [3, 4]]))

# Start a plasma store in its own process group so it can be killed later.
process = subprocess.Popen(
    ['plasma_store', '-m', '100', '-s', '/tmp/plasma', '-d', '/dev/shm'],
    stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    encoding='utf8', preexec_fn=os.setpgrp)

# Give the store time to start.
time.sleep(5)

client = plasma.connect('/tmp/plasma', '', 0)

try:
    client.put(m)
finally:
    client.disconnect()
    os.killpg(os.getpgid(process.pid), signal.SIGTERM)
{code}

Error:

{noformat}
  File "pyarrow/_plasma.pyx", line 397, in pyarrow._plasma.PlasmaClient.put
  File "pyarrow/serialization.pxi", line 338, in pyarrow.lib.serialize
  File "pyarrow/error.pxi", line 89, in pyarrow.lib.check_status
pyarrow.lib.ArrowNotImplementedError: This object exceeds the maximum recursion depth. It may contain itself recursively.
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
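One plausible reading of the "maximum recursion depth" error, not confirmed anywhere in this report, is that a serializer which descends into a matrix element by element never reaches a scalar: indexing an np.matrix with a single integer always yields another 2-D matrix. The NumPy-only sketch below illustrates that property; nesting_depth is a hypothetical helper, not pyarrow code.

```python
import numpy as np


def nesting_depth(value, limit=100):
    """Count how many times value[0] can be taken before a scalar appears.

    The walk stops at `limit` so that objects which never bottom out
    (such as np.matrix) do not loop forever.
    """
    depth = 0
    while isinstance(value, np.ndarray) and value.ndim > 0 and depth < limit:
        value = value[0]
        depth += 1
    return depth


array = np.array([[1, 2], [3, 4]])
matrix = np.matrix(array)

print(nesting_depth(array))   # stops after the two array dimensions
print(nesting_depth(matrix))  # only stops because of the limit
```

A plain 2-D array is exhausted after two steps, while the matrix keeps returning a 1x2 matrix at every step.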
Re: [RESULT] [VOTE] Release Apache Arrow 0.9.0 (RC2)
Hi!

Oh, no worries. Thanks for working on this. I just thought that
because the website went up, the release was ready, and that there was
a bug somewhere. I understand it takes time to do a release properly.


Mitar

On Thu, Mar 22, 2018 at 11:35 AM, Phillip Cloud <cpcl...@gmail.com> wrote:
> We are working on getting those wheels up as fast as we can. They should be
> available very soon. In the meantime, you can install pyarrow 0.9.0 with
> conda if you'd like.
Re: [RESULT] [VOTE] Release Apache Arrow 0.9.0 (RC2)
Hi!

The website seems to say that there is already a pyarrow 0.9.0
package, but it does not seem to be there yet:

https://arrow.apache.org/install/#python-wheels-on-pypi-unofficial
https://pypi.python.org/pypi/pyarrow

BTW, why are Python packages unofficial?


Mitar

On Thu, Mar 22, 2018 at 8:33 AM, Phillip Cloud <cpcl...@gmail.com> wrote:
> I'm working on updating the API docs.
>
> On Wed, Mar 21, 2018 at 10:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
>> I have put up blog posts to go out tomorrow about the Go code donation
>> and the release
>>
>> https://github.com/apache/arrow/pull/1776
>> https://github.com/apache/arrow/pull/1777
>>
>> Would someone like to take a crack at updating the generated API
>> documentation?
>>
>> On Wed, Mar 21, 2018 at 10:42 AM, Wes McKinney <wesmck...@gmail.com> wrote:
>> > If any items are missing from
>> > https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md,
>> > let's definitely add them. I'd like the RM process to be reasonably
>> > fool-proof
>> >
>> > On Wed, Mar 21, 2018 at 10:07 AM, Phillip Cloud <cpcl...@gmail.com> wrote:
>> >> Charles.Cloud
>> >>
>> >> On Wed, Mar 21, 2018 at 8:53 AM Uwe L. Korn <uw...@xhochy.com> wrote:
>> >>
>> >>> At least I have not. Philip, what is your login on pypi.python.org so I
>> >>> can add you as a maintainer there?
>> >>>
>> >>> On Wed, Mar 21, 2018, at 1:49 PM, Phillip Cloud wrote:
>> >>> > Has anyone started on pip wheels yet? If not, I will start cranking on it
>> >>> > today.
>> >>> >
>> >>> > On Tue, Mar 20, 2018, 22:47 Wes McKinney <wesmck...@gmail.com> wrote:
>> >>> >
>> >>> > > I haven't been able to draft a release blog post yet. We also have
>> >>> > > more packaging work to do. I suggest we announce Thursday morning and
>> >>> > > try to get the packaging completed -- we have conda-forge done as of
>> >>> > > right now, but pip and Java need to get uploaded.
>> >>> > >
>> >>> > > On Tue, Mar 20, 2018 at 2:58 AM, Siddharth Teotia <siddha...@dremio.com> wrote:
>> >>> > > > FYI: Created a PR for website update.
>> >>> > > >
>> >>> > > > On Mon, Mar 19, 2018 at 3:38 PM, Phillip Cloud <cpcl...@gmail.com> wrote:
>> >>> > > >
>> >>> > > >> Great! I'll volunteer to handle the conda-forge feedstock updates.
>> >>> > > >>
>> >>> > > >> On Mon, Mar 19, 2018 at 6:09 PM Wes McKinney <wesmck...@gmail.com> wrote:
>> >>> > > >>
>> >>> > > >> > With 4 binding +1 votes, 2 non-binding +1, and no other votes, the
>> >>> > > >> > vote passes. Thanks all!
>> >>> > > >> >
>> >>> > > >> > Let's get busy updating the C++, Python, Java packages and updating
>> >>> > > >> > the website. I will be able to draft a 0.9.0 blog post for the website
>> >>> > > >> > later today or tomorrow morning. I suggest we announce the release on
>> >>> > > >> > Wednesday morning after we have a chance to move along the binary
>> >>> > > >> > packaging process.
>> >>> > > >> >
>> >>> > > >> > Thanks
>> >>> > > >> > Wes
>> >>> > > >> >
>> >>> > > >> > On Mon, Mar 19, 2018 at 2:47 PM, Phillip Cloud <cpcl...@gmail.com> wrote:
>> >>> > > >> > > Just verified on windows, all systems are go for launch.
>> >>> > > >> > >
>> >>> > > >> > > On Mon, Mar 19, 2018 at 12:51 PM Li Jin <ice.xell...@gmail.com> wrote:
>> >>> > > >> > >
>> >>> > > >> > >> +1
[jira] [Created] (ARROW-2273) Cannot deserialize pandas SparseDataFrame
Mitar created ARROW-2273:

Summary: Cannot deserialize pandas SparseDataFrame
Key: ARROW-2273
URL: https://issues.apache.org/jira/browse/ARROW-2273
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.9.0
Reporter: Mitar

>>> import pyarrow
>>> import pandas
>>> a = pandas.SparseDataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
>>> pyarrow.deserialize(pyarrow.serialize(a).to_buffer())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "serialization.pxi", line 404, in pyarrow.lib.deserialize_from
  File "serialization.pxi", line 257, in pyarrow.lib.SerializedPyObject.deserialize
  File "serialization.pxi", line 174, in pyarrow.lib.SerializationContext._deserialize_callback
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/serialization.py", line 77, in _deserialize_pandas_dataframe
    return pdcompat.serialized_dict_to_dataframe(data)
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 450, in serialized_dict_to_dataframe
    for block in data['blocks']]
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 450, in <listcomp>
    for block in data['blocks']]
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 478, in _reconstruct_block
    block = _int.make_block(block_arr, placement=placement)
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pandas/core/internals.py", line 2957, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pandas/core/internals.py", line 120, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 3, placement implies 1
[jira] [Created] (ARROW-2269) Cannot build bdist_wheel for Python
Mitar created ARROW-2269:

Summary: Cannot build bdist_wheel for Python
Key: ARROW-2269
URL: https://issues.apache.org/jira/browse/ARROW-2269
Project: Apache Arrow
Issue Type: Bug
Components: Packaging
Affects Versions: 0.9.0
Reporter: Mitar

I am trying the current master. I ran:

{{python setup.py build_ext --build-type=$ARROW_BUILD_TYPE --with-parquet --with-plasma --bundle-arrow-cpp bdist_wheel}}

Output:

{{running build_ext
creating build
creating build/temp.linux-x86_64-3.6
-- Runnning cmake for pyarrow
cmake -DPYTHON_EXECUTABLE=.../Temp/arrow/pyarrow/bin/python -DPYARROW_BUILD_PARQUET=on -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_BUILD_PLASMA=on -DPYARROW_BUNDLE_ARROW_CPP=ON -DCMAKE_BUILD_TYPE=release .../Temp/arrow/arrow/python
-- The C compiler identification is GNU 7.2.0
-- The CXX compiler identification is GNU 7.2.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
INFOCompiler command: /usr/bin/c++
INFOCompiler version: Using built-in specs.
COLLECT_GCC=/usr/bin/c++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.2.0-8ubuntu3.2' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3.2)
INFOCompiler id: GNU
Selected compiler gcc 7.2.0
-- Performing Test CXX_SUPPORTS_SSE3
-- Performing Test CXX_SUPPORTS_SSE3 - Success
-- Performing Test CXX_SUPPORTS_ALTIVEC
-- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE=\{release,debug,...})
-- Build Type: RELEASE
-- Build output directory: .../Temp/arrow/arrow/python/build/temp.linux-x86_64-3.6/release/
-- Found PythonInterp: .../Temp/arrow/pyarrow/bin/python (found version "3.6.3")
-- Searching for Python libs in .../Temp/arrow/pyarrow/lib64;.../Temp/arrow/pyarrow/lib;/usr/lib/python3.6/config-3.6m-x86_64-linux-gnu
-- Looking for python3.6m
-- Found Python lib /usr/lib/python3.6/config-3.6m-x86_64-linux-gnu/libpython3.6m.so
-- Found PythonLibs: /usr/lib/python3.6/config-3.6m-x86_64-linux-gnu/libpython3.6m.so
-- Found NumPy: version "1.14.1" .../Temp/arrow/pyarrow/lib/python3.6/site-packages/numpy/core/include
-- Searching for Python libs in .../Temp/arrow/pyarrow/lib64;.../Temp/arrow/pyarrow/lib;/usr/lib/python3.6/config-3.6m-x86_64-linux-gnu
-- Looking for python3.6m
-- Found Python lib /usr/lib/python3.6/config-3.6m-x86_64-linux-gnu/libpython3.6m.so
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Checking for module 'arrow'
-- Found arrow, version 0.9.0-SNAPSHOT
-- Arrow ABI version: 0.0.0
-- Arrow SO version: 0
-- Found the Arrow core library: .../Temp/arrow/dist/lib/libarrow.so
-- Found the Arrow Python library: .../Temp/arrow/dist/lib/libarrow_python.so
-- Boost version: 1.63.0
-- Found the following Boost libraries:
--   system
--   filesystem
--   regex
Added shared library dependency arrow: .../Temp/arrow/dist/lib/libarrow.so
Added shared library dependency arrow_python: .../Temp/arrow/dist/lib/libarrow_python.so
-- Found the Parquet library: .../Temp/arrow/dist/lib/libparquet.so
Added shared library dependency parquet: .../Temp/arrow/dist/lib/libparquet.so
-- Checking for module 'plasma'
-- Found plasma, version
-- Plasma ABI version: 0.0.0
-- Plasma SO version: 0
-- Found the Plasma core library: .../Temp/arrow/
[jira] [Created] (ARROW-2264) Efficiently serialize numpy arrays with dtype of unicode fixed length string
Mitar created ARROW-2264:

Summary: Efficiently serialize numpy arrays with dtype of unicode fixed length string
Key: ARROW-2264
URL: https://issues.apache.org/jira/browse/ARROW-2264
Project: Apache Arrow
Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Mitar

Looking at the numpy array serialization code, it seems that an array with a dtype like "<U3" goes through the custom ndarray serializer rather than through the efficient one.

Example:

{{>>> np.array(['aaa', 'bbb'])}}
{{array(['aaa', 'bbb'], dtype='<U3')}}

This should be able to work, no? It has fixed offsets and memory layout.
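The "fixed offsets and memory layout" remark can be checked with NumPy alone. The sketch below shows that a fixed-width unicode array is one contiguous buffer of equal-sized items, so it can be rebuilt from its raw bytes plus the dtype string and shape. The helper names are hypothetical, and this only illustrates the property; it does not show pyarrow's serializer.

```python
import numpy as np


def to_raw(array):
    """Describe a fixed-width array as (dtype string, shape, raw bytes)."""
    return array.dtype.str, array.shape, array.tobytes()


def from_raw(dtype_str, shape, raw):
    """Rebuild the array from the triple produced by to_raw()."""
    return np.frombuffer(raw, dtype=np.dtype(dtype_str)).reshape(shape)


names = np.array(['aaa', 'bbb'])
dtype_str, shape, raw = to_raw(names)

# Each item stores 3 code points of 4 bytes each, with no per-item pointers.
print(names.dtype.itemsize, len(raw))
print(from_raw(dtype_str, shape, raw))
```

Note that np.frombuffer returns a read-only view over the bytes, which is usually what a zero-copy reader wants.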
How to properly serialize subclasses of supported classes
Hi!

I have a subclass of numpy and another of pandas, each of which adds a
metadata attribute. Moreover, I have a subclass of typing.List, as a
Python generic, with this metadata attribute as well. Now, it seems
that if I serialize these to the plasma store and back, I get a
standard numpy, pandas, or list object back, respectively.

My question is: how can I make it so that the proper subclasses are
returned, including the custom metadata attribute? I tried to use
pyarrow_lib._default_serialization_context.register_type but it does
not seem to work. Moreover, I still worry that even if I create a
serialization for a custom class, if anyone makes a subclass and tries
to store it in the plasma store, they will get back the custom class
and not the subclass.

This is how I am testing:

https://gitlab.com/datadrivendiscovery/metadata/blob/plasma/tests/test_plasma.py#L50

And here is the code for the custom numpy class and the attempt at
registering custom serialization:

https://gitlab.com/datadrivendiscovery/metadata/blob/plasma/d3m_metadata/container/numpy.py#L135

It looks like the custom serialization is not called.


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
[jira] [Created] (ARROW-2250) plasma_store process should cleanup on TERM signal as well
Mitar created ARROW-2250:

Summary: plasma_store process should cleanup on TERM signal as well
Key: ARROW-2250
URL: https://issues.apache.org/jira/browse/ARROW-2250
Project: Apache Arrow
Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Mitar

Currently it cleans up on the INT signal. But if it gets the TERM signal, it kills the parent process (the Python one) but not the binary process. I think the TERM and INT signals should be handled the same way.
[jira] [Created] (ARROW-1664) Support for xarray.DataArray and xarray.Dataset
Mitar created ARROW-1664:

Summary: Support for xarray.DataArray and xarray.Dataset
Key: ARROW-1664
URL: https://issues.apache.org/jira/browse/ARROW-1664
Project: Apache Arrow
Issue Type: Bug
Reporter: Mitar

DataArray and Dataset are efficient in-memory representations for multidimensional data. It would be great if one could share them between processes using Arrow.

http://xarray.pydata.org/en/stable/generated/xarray.DataArray.html#xarray.DataArray
http://xarray.pydata.org/en/stable/generated/xarray.Dataset.html#xarray.Dataset

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)