[jira] [Commented] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests
[ https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583408#comment-16583408 ] Wes McKinney commented on ARROW-1380: - The valgrind output for the Python unit tests where this occurs is being suppressed; see https://github.com/apache/arrow/pull/1883 > [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests > - > > Key: ARROW-1380 > URL: https://issues.apache.org/jira/browse/ARROW-1380 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: Wes McKinney >Priority: Major > Fix For: 0.11.0 > > Attachments: LastTest.log > > > I thought I fixed this, but they seem to have recurred: > https://travis-ci.org/apache/arrow/jobs/266421430#L5220 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-2797) [JS] comparison predicates don't work on 64-bit integers
[ https://issues.apache.org/jira/browse/ARROW-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette reassigned ARROW-2797: Assignee: Brian Hulette > [JS] comparison predicates don't work on 64-bit integers > > > Key: ARROW-2797 > URL: https://issues.apache.org/jira/browse/ARROW-2797 > Project: Apache Arrow > Issue Type: Bug > Components: JavaScript >Affects Versions: JS-0.3.1 >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > Fix For: JS-0.5.0 > > > The 64-bit integer vector {{get}} function returns a 2-element array, which > doesn't compare properly in the comparison predicates. We should special-case > the comparisons for 64-bit integers and timestamps. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
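The comparison problem described in ARROW-2797 can be illustrated outside the Arrow JS code base. The following is a minimal Python sketch, not the Arrow JS implementation. It assumes that the 2-element value holds the low and the high 32-bit word of the integer, in that order; the helper names `to_int64`, `int64_equals` and `int64_less_than` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_int64(words, signed=True):
    """Combine a (low, high) pair of 32-bit words into one integer.

    Assumption: words[0] is the low word and words[1] is the high word.
    The words may be given as signed or unsigned 32-bit values.
    """
    low, high = words
    value = ((high & MASK32) << 32) | (low & MASK32)
    if signed and value >= 1 << 63:
        # Reinterpret the 64-bit pattern as a two's-complement value.
        value -= 1 << 64
    return value


def int64_equals(a, b):
    """Compare two (low, high) pairs by the value they encode."""
    return to_int64(a) == to_int64(b)


def int64_less_than(a, b, signed=True):
    """Order two (low, high) pairs by the value they encode."""
    return to_int64(a, signed) < to_int64(b, signed)
```

Comparing the pairs element by element gives the wrong order whenever the low words disagree with the high words: `(5, 0)` encodes 5 and `(0, 1)` encodes 4294967296, yet a pairwise comparison ranks `(5, 0)` higher. Combining the words first avoids this.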
[jira] [Updated] (ARROW-2797) [JS] comparison predicates don't work on 64-bit integers
[ https://issues.apache.org/jira/browse/ARROW-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2797: - Fix Version/s: JS-0.5.0 > [JS] comparison predicates don't work on 64-bit integers > > > Key: ARROW-2797 > URL: https://issues.apache.org/jira/browse/ARROW-2797 > Project: Apache Arrow > Issue Type: Bug > Components: JavaScript >Affects Versions: JS-0.3.1 >Reporter: Brian Hulette >Priority: Major > Fix For: JS-0.5.0 > > > The 64-bit integer vector {{get}} function returns a 2-element array, which > doesn't compare properly in the comparison predicates. We should special-case > the comparisons for 64-bit integers and timestamps. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2235) [JS] Add tests for IPC messages split across multiple buffers
[ https://issues.apache.org/jira/browse/ARROW-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2235: - Fix Version/s: JS-0.5.0 > [JS] Add tests for IPC messages split across multiple buffers > - > > Key: ARROW-2235 > URL: https://issues.apache.org/jira/browse/ARROW-2235 > Project: Apache Arrow > Issue Type: Task > Components: JavaScript >Reporter: Brian Hulette >Priority: Major > Fix For: JS-0.5.0 > > > See https://github.com/apache/arrow/pull/1670 > This is probably easiest to do after the JS IPC writer is finished > (ARROW-2116) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2410) [JS] Add DataFrame.scanAsync
[ https://issues.apache.org/jira/browse/ARROW-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2410: - Fix Version/s: JS-0.5.0 > [JS] Add DataFrame.scanAsync > > > Key: ARROW-2410 > URL: https://issues.apache.org/jira/browse/ARROW-2410 > Project: Apache Arrow > Issue Type: Improvement > Components: JavaScript >Reporter: Brian Hulette >Priority: Major > Fix For: JS-0.5.0 > > > Add a version of `DataFrame.scan`, `scanAsync` that yields periodically. The > yield frequency could be specified either as a number of record batches, or a > number of records. > This scan should also be cancellable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
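The behaviour requested in ARROW-2410 (a scan that yields periodically and can be cancelled) can be sketched as follows. This is a Python asyncio illustration of the idea only, not the proposed Arrow JS API: the function name `scan_async`, its parameters, and the use of an event for cancellation are all assumptions, and the yield frequency is expressed only as a number of record batches.

```python
import asyncio


async def scan_async(batches, on_record, yield_every_batches=1, cancel_event=None):
    """Visit every record, handing control back to the event loop periodically.

    Returns the number of records visited before the scan finished or was
    cancelled. Cancellation is checked once per batch (an assumption).
    """
    if yield_every_batches < 1:
        raise ValueError("yield_every_batches must be at least 1")
    visited = 0
    for batch_index, batch in enumerate(batches, start=1):
        if cancel_event is not None and cancel_event.is_set():
            break
        for record in batch:
            on_record(record)
            visited += 1
        if batch_index % yield_every_batches == 0:
            # Yield so that other tasks on the event loop can make progress.
            await asyncio.sleep(0)
    return visited


async def demo():
    """Scan three small batches and cancel from inside the callback."""
    batches = [[1, 2], [3, 4], [5, 6]]
    seen = []
    cancel = asyncio.Event()

    def on_record(record):
        seen.append(record)
        if record == 4:
            cancel.set()  # takes effect before the next batch starts

    visited = await scan_async(batches, on_record, cancel_event=cancel)
    return seen, visited
```

In `demo`, the cancellation request is made while the second batch is being processed, so that batch completes and the third one is never visited. Checking for cancellation per record instead of per batch would be a straightforward variation.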
[jira] [Updated] (ARROW-1700) [JS] Implement Node.js client for Plasma store
[ https://issues.apache.org/jira/browse/ARROW-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-1700: - Fix Version/s: JS-0.5.0 > [JS] Implement Node.js client for Plasma store > -- > > Key: ARROW-1700 > URL: https://issues.apache.org/jira/browse/ARROW-1700 > Project: Apache Arrow > Issue Type: New Feature > Components: JavaScript, Plasma (C++) >Reporter: Robert Nishihara >Priority: Major > Fix For: JS-0.5.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-1700) [JS] Implement Node.js client for Plasma store
[ https://issues.apache.org/jira/browse/ARROW-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-1700: - Summary: [JS] Implement Node.js client for Plasma store (was: Implement Node.js client for Plasma store) > [JS] Implement Node.js client for Plasma store > -- > > Key: ARROW-1700 > URL: https://issues.apache.org/jira/browse/ARROW-1700 > Project: Apache Arrow > Issue Type: New Feature > Components: JavaScript, Plasma (C++) >Reporter: Robert Nishihara >Priority: Major > Fix For: JS-0.5.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2766) [JS] Add ability to construct a Table from a list of Arrays/TypedArrays
[ https://issues.apache.org/jira/browse/ARROW-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2766: - Fix Version/s: JS-0.5.0 > [JS] Add ability to construct a Table from a list of Arrays/TypedArrays > --- > > Key: ARROW-2766 > URL: https://issues.apache.org/jira/browse/ARROW-2766 > Project: Apache Arrow > Issue Type: New Feature > Components: JavaScript >Reporter: Brian Hulette >Priority: Major > Fix For: JS-0.5.0 > > > Something like > {code:javascript} > Table.from({'col1': [...], 'col2': [...], 'col3': [...]}) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-1744) [Plasma] Provide TensorFlow operator to read tensors from plasma
[ https://issues.apache.org/jira/browse/ARROW-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583295#comment-16583295 ] Brian Hulette commented on ARROW-1744: -- I think the same thing happened in ARROW-2940, ARROW-2451, ARROW-2437, ARROW-2458, and ARROW-2397 - I went ahead and updated them all. It looks like these mistakes prevented them from being added to CHANGELOG.md for v0.10.0 > [Plasma] Provide TensorFlow operator to read tensors from plasma > > > Key: ARROW-1744 > URL: https://issues.apache.org/jira/browse/ARROW-1744 > Project: Apache Arrow > Issue Type: Improvement > Components: Plasma (C++) >Reporter: Philipp Moritz >Assignee: Philipp Moritz >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > Time Spent: 17h 10m > Remaining Estimate: 0h > > see https://www.tensorflow.org/extend/adding_an_op -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-1744) [Plasma] Provide TensorFlow operator to read tensors from plasma
[ https://issues.apache.org/jira/browse/ARROW-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-1744: - Fix Version/s: (was: JS-0.4.0) 0.10.0 > [Plasma] Provide TensorFlow operator to read tensors from plasma > > > Key: ARROW-1744 > URL: https://issues.apache.org/jira/browse/ARROW-1744 > Project: Apache Arrow > Issue Type: Improvement > Components: Plasma (C++) >Reporter: Philipp Moritz >Assignee: Philipp Moritz >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > Time Spent: 17h 10m > Remaining Estimate: 0h > > see https://www.tensorflow.org/extend/adding_an_op -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2397) Document changes in Tensor encoding in IPC.md.
[ https://issues.apache.org/jira/browse/ARROW-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2397: - Fix Version/s: (was: JS-0.4.0) 0.10.0 > Document changes in Tensor encoding in IPC.md. > -- > > Key: ARROW-2397 > URL: https://issues.apache.org/jira/browse/ARROW-2397 > Project: Apache Arrow > Issue Type: Improvement > Components: Documentation >Reporter: Robert Nishihara >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > Update IPC.md to reflect the changes in > https://github.com/apache/arrow/pull/1802. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2458) [Plasma] PlasmaClient uses global variable
[ https://issues.apache.org/jira/browse/ARROW-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2458: - Fix Version/s: (was: JS-0.4.0) 0.10.0 > [Plasma] PlasmaClient uses global variable > -- > > Key: ARROW-2458 > URL: https://issues.apache.org/jira/browse/ARROW-2458 > Project: Apache Arrow > Issue Type: Improvement > Components: Plasma (C++) >Affects Versions: 0.9.0 >Reporter: Philipp Moritz >Assignee: Philipp Moritz >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > The threadpool threadpool_ that PlasmaClient is using is global at the > moment. This prevents us from using multiple PlasmaClients in the same > process (one per thread). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2451) Handle more dtypes efficiently in custom numpy array serializer.
[ https://issues.apache.org/jira/browse/ARROW-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2451: - Fix Version/s: (was: JS-0.4.0) 0.10.0 > Handle more dtypes efficiently in custom numpy array serializer. > > > Key: ARROW-2451 > URL: https://issues.apache.org/jira/browse/ARROW-2451 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Robert Nishihara >Assignee: Robert Nishihara >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > Right now certain dtypes like bool or fixed length strings are serialized as > lists, which is inefficient. We can handle these more efficiently by casting > them to uint8 and saving the original dtype as additional data. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2437) [C++] Change of arrow::ipc::ReadMessage signature breaks ABI compability
[ https://issues.apache.org/jira/browse/ARROW-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Hulette updated ARROW-2437:
    Fix Version/s: (was: JS-0.4.0) 0.10.0
> [C++] Change of arrow::ipc::ReadMessage signature breaks ABI compability
>
> Key: ARROW-2437
> URL: https://issues.apache.org/jira/browse/ARROW-2437
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Uwe L. Korn
> Assignee: Robert Nishihara
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.10.0
>
> We changed the signature of the method from
> {code}
> ReadMessage ( arrow::io::InputStream* file, std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >* message )
> {code}
> to
> {code}
> ReadMessage ( arrow::io::InputStream* file, std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >* message, bool aligned )
> {code}
> We should add the old signature so that the 0.9.1 release is ABI compatible
> with 0.9.0.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2940) [Python] Import error with pytorch 0.3
[ https://issues.apache.org/jira/browse/ARROW-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Hulette updated ARROW-2940:
    Fix Version/s: (was: JS-0.4.0) 0.10.0
> [Python] Import error with pytorch 0.3
>
> Key: ARROW-2940
> URL: https://issues.apache.org/jira/browse/ARROW-2940
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Philipp Moritz
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.10.0
> Time Spent: 50m
> Remaining Estimate: 0h
>
> The fix in ARROW-2920 doesn't work in versions strictly before pytorch 0.4:
> {code:java}
> >>> import pyarrow
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/ubuntu/arrow/python/pyarrow/__init__.py", line 57, in <module>
>     compat.import_pytorch_extension()
>   File "/home/ubuntu/arrow/python/pyarrow/compat.py", line 249, in import_pytorch_extension
>     ctypes.CDLL(os.path.join(path, "lib/libcaffe2.so"))
>   File "/home/ubuntu/anaconda3/envs/breaking-env2/lib/python3.5/ctypes/__init__.py", line 351, in __init__
>     self._handle = _dlopen(self._name, mode)
> OSError: /home/ubuntu/anaconda3/envs/breaking-env2/lib/python3.5/site-packages/torch/lib/libcaffe2.so: cannot open shared object file: No such file or directory
> {code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2975) [Plasma] TensorFlow op: Compilation only working if arrow found by pkg-config
[ https://issues.apache.org/jira/browse/ARROW-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2975: - Fix Version/s: (was: JS-0.4.0) 0.11.0 > [Plasma] TensorFlow op: Compilation only working if arrow found by pkg-config > - > > Key: ARROW-2975 > URL: https://issues.apache.org/jira/browse/ARROW-2975 > Project: Apache Arrow > Issue Type: Improvement > Components: Plasma (C++) >Reporter: Philipp Moritz >Assignee: Philipp Moritz >Priority: Major > Labels: pull-request-available > Fix For: 0.11.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Currently the pyarrow/tensorflow/build.sh script uses pyarrow to discover the > arrow libraries to link against. However, this is not working on the pip > package of pyarrow (since the .pc files are not shipped with it). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-1744) [Plasma] Provide TensorFlow operator to read tensors from plasma
[ https://issues.apache.org/jira/browse/ARROW-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583285#comment-16583285 ] Brian Hulette commented on ARROW-1744: -- It looks like this was actually merged for 0.10 (and certainly not JS-0.4.0) - is it too late to update the fix version? > [Plasma] Provide TensorFlow operator to read tensors from plasma > > > Key: ARROW-1744 > URL: https://issues.apache.org/jira/browse/ARROW-1744 > Project: Apache Arrow > Issue Type: Improvement > Components: Plasma (C++) >Reporter: Philipp Moritz >Assignee: Philipp Moritz >Priority: Major > Labels: pull-request-available > Fix For: JS-0.4.0 > > Time Spent: 17h 10m > Remaining Estimate: 0h > > see https://www.tensorflow.org/extend/adding_an_op -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2764) [JS] Easy way to create a new Table with an additional column
[ https://issues.apache.org/jira/browse/ARROW-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2764: - Fix Version/s: (was: JS-0.4.0) JS-0.5.0 > [JS] Easy way to create a new Table with an additional column > - > > Key: ARROW-2764 > URL: https://issues.apache.org/jira/browse/ARROW-2764 > Project: Apache Arrow > Issue Type: Improvement > Components: JavaScript >Reporter: Brian Hulette >Priority: Major > Fix For: JS-0.5.0 > > > It should be easier to add a new column to a table. API could be either > `table.addColumn(vector)` or `table.merge(..tables or vectors)` -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-2819) [JS] Fails to build with TS 2.8.3
[ https://issues.apache.org/jira/browse/ARROW-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette resolved ARROW-2819. -- Resolution: Fixed Fixed in [#2201](https://github.com/apache/arrow/pull/2201) > [JS] Fails to build with TS 2.8.3 > - > > Key: ARROW-2819 > URL: https://issues.apache.org/jira/browse/ARROW-2819 > Project: Apache Arrow > Issue Type: Bug > Components: JavaScript >Affects Versions: JS-0.3.1 >Reporter: Brian Hulette >Assignee: Paul Taylor >Priority: Major > Fix For: JS-0.4.0 > > > See the [GitHub > issue|https://github.com/apache/arrow/issues/2115#issuecomment-403612925] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2828) [JS] Refactor Vector Data classes
[ https://issues.apache.org/jira/browse/ARROW-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated ARROW-2828: - Fix Version/s: (was: JS-0.4.0) JS-0.5.0 > [JS] Refactor Vector Data classes > - > > Key: ARROW-2828 > URL: https://issues.apache.org/jira/browse/ARROW-2828 > Project: Apache Arrow > Issue Type: Task > Components: JavaScript >Reporter: Paul Taylor >Assignee: Paul Taylor >Priority: Major > Fix For: JS-0.5.0 > > > In order to make it easier to build some of the higher-level APIs, we need to > slim the Vector Data classes down to just one base implementation. > Initial WIP commit here, and work will continue in this branch: > https://github.com/trxcllnt/arrow/commit/dfad9023583bef4f8d2a50ea25f643e4bccbc805#diff-2512057432c4ebf55c6308cb06b43b08 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests
[ https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583276#comment-16583276 ] Lukasz Bartnik edited comment on ARROW-1380 at 8/17/18 2:22 AM: I took a quick look at a recent build (https://travis-ci.org/apache/arrow/builds/417014924). Neither of its C++ jobs (https://travis-ci.org/apache/arrow/jobs/417014931 and https://travis-ci.org/apache/arrow/jobs/417014927) seems to use valgrind. The only job that seems to use valgrind is the openjdk8/gcc one (https://travis-ci.org/apache/arrow/jobs/417014925), but there are no reports from valgrind in the log; in fact, valgrind doesn't seem to be used there at all. Looking at the job descriptions: the original job where "still reachable" blocks were reported was a "gcc C++" one, but there were two such jobs back then (3786.1 and 3786.8), whereas there's only one now (9492.7). It seems that the error was fixed between builds 3786 and 9492. I'm attaching the LastTest.log, which does not contain any valgrind alarms: every "HEAP SUMMARY" line is followed by an "in use at exit: 0 bytes in 0 blocks" line. was (Author: lbartnik): I took a quick look at a recent build (https://travis-ci.org/apache/arrow/builds/417014924). Neither of its C++ jobs (https://travis-ci.org/apache/arrow/jobs/417014931 and https://travis-ci.org/apache/arrow/jobs/417014927) seems to use valgrind. The only job that seems to use valgrind is the openjdk8/gcc one (https://travis-ci.org/apache/arrow/jobs/417014925), but there are no reports from valgrind in the log; in fact, valgrind doesn't seem to be used there at all. Looking at the job descriptions: the original job where "still reachable" blocks were reported was a "gcc C++" one, but there were two such jobs back then (3786.1 and 3786.8), whereas there's only one now (9492.7). It seems that the error was fixed between builds 3786 and 9492.
I'm attaching the LastTest.log, which does not contain any valgrind alarms: every "HEAP SUMMARY" line is followed by an "in use at exit: 0 bytes in 0 blocks" line. > [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests > - > > Key: ARROW-1380 > URL: https://issues.apache.org/jira/browse/ARROW-1380 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: Wes McKinney >Priority: Major > Fix For: 0.11.0 > > Attachments: LastTest.log > > > I thought I fixed this, but they seem to have recurred: > https://travis-ci.org/apache/arrow/jobs/266421430#L5220 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests
[ https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukasz Bartnik updated ARROW-1380: -- Attachment: LastTest.log > [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests > - > > Key: ARROW-1380 > URL: https://issues.apache.org/jira/browse/ARROW-1380 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: Wes McKinney >Priority: Major > Fix For: 0.11.0 > > Attachments: LastTest.log > > > I thought I fixed this, but they seem to have recurred: > https://travis-ci.org/apache/arrow/jobs/266421430#L5220 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests
[ https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583276#comment-16583276 ] Lukasz Bartnik commented on ARROW-1380: --- I took a quick look at a recent build (https://travis-ci.org/apache/arrow/builds/417014924). Neither of its C++ jobs (https://travis-ci.org/apache/arrow/jobs/417014931 and https://travis-ci.org/apache/arrow/jobs/417014927) seems to use valgrind. The only job that seems to use valgrind is the openjdk8/gcc one (https://travis-ci.org/apache/arrow/jobs/417014925), but there are no reports from valgrind in the log; in fact, valgrind doesn't seem to be used there at all. Looking at the job descriptions: the original job where "still reachable" blocks were reported was a "gcc C++" one, but there were two such jobs back then (3786.1 and 3786.8), whereas there's only one now (9492.7). It seems that the error was fixed between builds 3786 and 9492. I'm attaching the LastTest.log, which does not contain any valgrind alarms: every "HEAP SUMMARY" line is followed by an "in use at exit: 0 bytes in 0 blocks" line. > [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests > - > > Key: ARROW-1380 > URL: https://issues.apache.org/jira/browse/ARROW-1380 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: Wes McKinney >Priority: Major > Fix For: 0.11.0 > > > I thought I fixed this, but they seem to have recurred: > https://travis-ci.org/apache/arrow/jobs/266421430#L5220 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
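The log check described in the comment on ARROW-1380 (every "HEAP SUMMARY" line followed by an "in use at exit: 0 bytes in 0 blocks" line) can be automated. The following Python sketch shows one way to do it; the function name `check_heap_summaries` is hypothetical, and the sketch assumes the "in use at exit" line comes immediately after the "HEAP SUMMARY" line, as the comment describes.

```python
CLEAN_EXIT = "in use at exit: 0 bytes in 0 blocks"


def check_heap_summaries(log_lines):
    """Check that each HEAP SUMMARY line is followed by a clean exit line.

    Returns (total, leaky): the number of HEAP SUMMARY lines found and the
    1-based line numbers of those whose next line does not report zero
    bytes in use at exit.
    """
    lines = list(log_lines)
    total = 0
    leaky = []
    for index, line in enumerate(lines):
        if "HEAP SUMMARY" not in line:
            continue
        total += 1
        following = lines[index + 1] if index + 1 < len(lines) else ""
        if CLEAN_EXIT not in following:
            leaky.append(index + 1)
    return total, leaky
```

For a log on disk, the lines can be passed directly from an open file, for example `check_heap_summaries(open("LastTest.log"))`; an empty second element in the result means no summary reported memory still in use.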
[jira] [Updated] (ARROW-3065) concat_tables() failing from bad Pandas Metadata
[ https://issues.apache.org/jira/browse/ARROW-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-3065: Fix Version/s: (was: 0.9.0) 0.11.0 > concat_tables() failing from bad Pandas Metadata > > > Key: ARROW-3065 > URL: https://issues.apache.org/jira/browse/ARROW-3065 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.10.0 >Reporter: David Lee >Priority: Major > Fix For: 0.11.0 > > > Looks like the major bug from > https://issues.apache.org/jira/browse/ARROW-1941 is back... > After I downgraded from 0.10.0 to 0.9.0, the error disappeared.. > {code:python} > new_arrow_table = pa.concat_tables(my_arrow_tables) > File "pyarrow/table.pxi", line 1562, in pyarrow.lib.concat_tables > File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status > pyarrow.lib.ArrowInvalid: Schema at index 2 was different: > {code} > In order to debug this I saved the first 4 arrow tables to 4 parquet files > and inspected the parquet files. The parquet schema is identical, but the > Pandas Metadata is different. > {code:python} > for i in range(5): > pq.write_table(my_arrow_tables[i], "test" + str(i) + ".parquet") > {code} > It looks like a column which contains empty strings is getting typed as > float64. 
> {code:python} > >>> test1.schema > HoldingDetail_Id: string > metadata > > {b'pandas': b'{"index_columns": [], "column_indexes": [], "columns": [ > {"name": "HoldingDetail_Id", "field_name": "HoldingDetail_Id", "pandas_type": > "unicode", "numpy_type": "object", "metadata": null}, > >>> test1[0] > > [ > [ > "Z4", > "SF", > "J7", > "W6", > "L7", > "Q9", > "NE", > "F7", > >>> test2.schema > HoldingDetail_Id: string > metadata > > {b'pandas': b'{"index_columns": [], "column_indexes": [], "columns": [ > {"name": "HoldingDetail_Id", "field_name": "HoldingDetail_Id", "pandas_type": > "unicode", "numpy_type": "float64", "metadata": null}, > >>> test2[0] > > [ > [ > "", > "", > "", > "", > "", > "", > "", > "", > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3065) concat_tables() failing from bad Pandas Metadata
David Lee created ARROW-3065: Summary: concat_tables() failing from bad Pandas Metadata Key: ARROW-3065 URL: https://issues.apache.org/jira/browse/ARROW-3065 Project: Apache Arrow Issue Type: Bug Components: Python Affects Versions: 0.10.0 Reporter: David Lee Fix For: 0.9.0 Looks like the major bug from https://issues.apache.org/jira/browse/ARROW-1941 is back... After I downgraded from 0.10.0 to 0.9.0, the error disappeared.. {code:python} new_arrow_table = pa.concat_tables(my_arrow_tables) File "pyarrow/table.pxi", line 1562, in pyarrow.lib.concat_tables File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status pyarrow.lib.ArrowInvalid: Schema at index 2 was different: {code} In order to debug this I saved the first 4 arrow tables to 4 parquet files and inspected the parquet files. The parquet schema is identical, but the Pandas Metadata is different. {code:python} for i in range(5): pq.write_table(my_arrow_tables[i], "test" + str(i) + ".parquet") {code} It looks like a column which contains empty strings is getting typed as float64. {code:python} >>> test1.schema HoldingDetail_Id: string metadata {b'pandas': b'{"index_columns": [], "column_indexes": [], "columns": [ {"name": "HoldingDetail_Id", "field_name": "HoldingDetail_Id", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, >>> test1[0] [ [ "Z4", "SF", "J7", "W6", "L7", "Q9", "NE", "F7", >>> test2.schema HoldingDetail_Id: string metadata {b'pandas': b'{"index_columns": [], "column_indexes": [], "columns": [ {"name": "HoldingDetail_Id", "field_name": "HoldingDetail_Id", "pandas_type": "unicode", "numpy_type": "float64", "metadata": null}, >>> test2[0] [ [ "", "", "", "", "", "", "", "", {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
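The difference reported in ARROW-3065 sits in the JSON stored under the `b'pandas'` schema metadata key, so it can be located by decoding that JSON and comparing the `numpy_type` of each column. The following Python sketch uses only the standard library. The helper names are hypothetical, and the two metadata blobs in the usage note are shortened, completed versions of the truncated output quoted in the report, given for illustration only.

```python
import json


def numpy_types(pandas_metadata):
    """Map column name to numpy_type from a b'pandas' metadata blob."""
    decoded = json.loads(pandas_metadata)
    return {column["name"]: column["numpy_type"] for column in decoded["columns"]}


def diff_numpy_types(first, second):
    """Return {column: (type_in_first, type_in_second)} for mismatches.

    Only columns present in both blobs are compared.
    """
    a = numpy_types(first)
    b = numpy_types(second)
    return {
        name: (a[name], b[name])
        for name in a
        if name in b and a[name] != b[name]
    }
```

With two blobs that differ as in the report, the mismatch shows up directly:

```python
meta1 = b'{"index_columns": [], "column_indexes": [], "columns": [{"name": "HoldingDetail_Id", "field_name": "HoldingDetail_Id", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}]}'
meta2 = b'{"index_columns": [], "column_indexes": [], "columns": [{"name": "HoldingDetail_Id", "field_name": "HoldingDetail_Id", "pandas_type": "unicode", "numpy_type": "float64", "metadata": null}]}'
print(diff_numpy_types(meta1, meta2))
```

In pyarrow the blob would typically be read from `table.schema.metadata[b'pandas']`; that step is left out here so the sketch runs without pyarrow installed.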
[jira] [Resolved] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable
[ https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-1799. - Resolution: Fixed Issue resolved by pull request 2440 [https://github.com/apache/arrow/pull/2440] > [Plasma C++] Make unittest does not create plasma store executable > -- > > Key: ARROW-1799 > URL: https://issues.apache.org/jira/browse/ARROW-1799 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: William Paul >Assignee: Lukasz Bartnik >Priority: Minor > Labels: pull-request-available > Fix For: 0.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Steps to reproduce from a fresh clone of Arrow: > mkdir cpp/debug > cd cpp/debug > cmake .. -DARROW_PLASMA=on > make -j8 unittest > client_tests may then fail due to the store executable not being created. The > first time I reproduced the issue the test did fail, but the test passed on > subsequent reproductions of this issue. Regardless, if you look in > cpp/debug/debug, there is no plasma store executable. If you then call make, > the store executable is generated in that directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable
[ https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-1799: --- Assignee: Lukasz Bartnik > [Plasma C++] Make unittest does not create plasma store executable > -- > > Key: ARROW-1799 > URL: https://issues.apache.org/jira/browse/ARROW-1799 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: William Paul >Assignee: Lukasz Bartnik >Priority: Minor > Labels: pull-request-available > Fix For: 0.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Steps to reproduce from a fresh clone of Arrow: > mkdir cpp/debug > cd cpp/debug > cmake .. -DARROW_PLASMA=on > make -j8 unittest > client_tests may then fail due to the store executable not being created. The > first time I reproduced the issue the test did fail, but the test passed on > subsequent reproductions of this issue. Regardless, if you look in > cpp/debug/debug, there is no plasma store executable. If you then call make, > the store executable is generated in that directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3064) [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for particular unit test executables
[ https://issues.apache.org/jira/browse/ARROW-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3064: -- Labels: pull-request-available (was: ) > [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for > particular unit test executables > --- > > Key: ARROW-3064 > URL: https://issues.apache.org/jira/browse/ARROW-3064 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 0.11.0 > > > See ARROW-1799 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-3064) [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for particular unit test executables
[ https://issues.apache.org/jira/browse/ARROW-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-3064: --- Assignee: Lukasz Bartnik > [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for > particular unit test executables > --- > > Key: ARROW-3064 > URL: https://issues.apache.org/jira/browse/ARROW-3064 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Lukasz Bartnik >Priority: Major > Labels: pull-request-available > Fix For: 0.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > See ARROW-1799 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-3064) [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for particular unit test executables
[ https://issues.apache.org/jira/browse/ARROW-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-3064. - Resolution: Fixed Issue resolved by pull request 2439 [https://github.com/apache/arrow/pull/2439] > [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for > particular unit test executables > --- > > Key: ARROW-3064 > URL: https://issues.apache.org/jira/browse/ARROW-3064 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 0.11.0 > > > See ARROW-1799 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable
[ https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-1799: -- Labels: pull-request-available (was: ) > [Plasma C++] Make unittest does not create plasma store executable > -- > > Key: ARROW-1799 > URL: https://issues.apache.org/jira/browse/ARROW-1799 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: William Paul >Priority: Minor > Labels: pull-request-available > Fix For: 0.11.0 > > > Steps to reproduce from a fresh clone of Arrow: > mkdir cpp/debug > cd cpp/debug > cmake .. -DARROW_PLASMA=on > make -j8 unittest > client_tests may then fail due to the store executable not being created. The > first time I reproduced the issue the test did fail, but the test passed on > subsequent reproductions of this issue. Regardless, if you look in > cpp/debug/debug, there is no plasma store executable. If you then call make, > the store executable is generated in that directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3064) [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for particular unit test executables
Wes McKinney created ARROW-3064: --- Summary: [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for particular unit test executables Key: ARROW-3064 URL: https://issues.apache.org/jira/browse/ARROW-3064 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Wes McKinney Fix For: 0.11.0 See ARROW-1799 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable
[ https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582778#comment-16582778 ] Wes McKinney edited comment on ARROW-1799 at 8/16/18 4:25 PM: -- I missed this issue when it was reported. This can be resolved by adding dependencies to the tests on the plasma_store executable at https://github.com/apache/arrow/blob/master/cpp/src/plasma/CMakeLists.txt#L199 we should probably add a {{DEPENDENCIES}} arg to {{ADD_ARROW_TEST}} to make this simpler was (Author: wesmckinn): I missed this issue when it was reported. This can be resolved by adding dependencies to the tests on the plasma_store executable at https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/file.cc#L355 we should probably add a {{DEPENDENCIES}} arg to {{ADD_ARROW_TEST}} to make this simpler > [Plasma C++] Make unittest does not create plasma store executable > -- > > Key: ARROW-1799 > URL: https://issues.apache.org/jira/browse/ARROW-1799 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: William Paul >Priority: Minor > Fix For: 0.11.0 > > > Steps to reproduce from a fresh clone of Arrow: > mkdir cpp/debug > cd cpp/debug > cmake .. -DARROW_PLASMA=on > make -j8 unittest > client_tests may then fail due to the store executable not being created. The > first time I reproduced the issue the test did fail, but the test passed on > subsequent reproductions of this issue. Regardless, if you look in > cpp/debug/debug, there is no plasma store executable. If you then call make, > the store executable is generated in that directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
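The dependency problem described in this thread can be illustrated with a toy build graph. This is only a sketch: the target names unittest, client_tests and plasma_store come from the thread, but the graph itself and the helper targets_built are hypothetical and much simpler than Arrow's actual CMake configuration.

```python
def targets_built(goal, deps):
    """Return the targets a build tool would build for `goal`, in build order.

    `deps` maps a target name to the list of targets it declares as
    dependencies. Only declared dependencies are visited, which is the
    point of the illustration.
    """
    order = []
    seen = set()

    def visit(target):
        if target in seen:
            return
        seen.add(target)
        for dep in deps.get(target, []):
            visit(dep)
        order.append(target)

    visit(goal)
    return order


# Hypothetical graph before the fix: the test binary does not declare
# the store executable it needs at run time.
before = {
    "unittest": ["client_tests"],
    "client_tests": [],
    "all": ["client_tests", "plasma_store"],
}

# Same graph with the missing dependency declared on the test target.
after = dict(before, client_tests=["plasma_store"])

print(targets_built("unittest", before))
print(targets_built("unittest", after))
```

In the first graph, requesting unittest never reaches plasma_store, which matches the report that the store executable only appears after a plain make. Declaring the dependency on the test target is the kind of change a DEPENDENCIES argument would make easier to express.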
[jira] [Commented] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable
[ https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582778#comment-16582778 ] Wes McKinney commented on ARROW-1799: - I missed this issue when it was reported. This can be resolved by adding dependencies to the tests on the plasma_store executable at https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/file.cc#L355 we should probably add a {{DEPENDENCIES}} arg to {{ADD_ARROW_TEST}} to make this simpler > [Plasma C++] Make unittest does not create plasma store executable > -- > > Key: ARROW-1799 > URL: https://issues.apache.org/jira/browse/ARROW-1799 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: William Paul >Priority: Minor > Fix For: 0.11.0 > > > Steps to reproduce from a fresh clone of Arrow: > mkdir cpp/debug > cd cpp/debug > cmake .. -DARROW_PLASMA=on > make -j8 unittest > client_tests may then fail due to the store executable not being created. The > first time I reproduced the issue the test did fail, but the test passed on > subsequent reproductions of this issue. Regardless, if you look in > cpp/debug/debug, there is no plasma store executable. If you then call make, > the store executable is generated in that directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable
[ https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-1799: Fix Version/s: 0.11.0 > [Plasma C++] Make unittest does not create plasma store executable > -- > > Key: ARROW-1799 > URL: https://issues.apache.org/jira/browse/ARROW-1799 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: William Paul >Priority: Minor > Fix For: 0.11.0 > > > Steps to reproduce from a fresh clone of Arrow: > mkdir cpp/debug > cd cpp/debug > cmake .. -DARROW_PLASMA=on > make -j8 unittest > client_tests may then fail due to the store executable not being created. The > first time I reproduced the issue the test did fail, but the test passed on > subsequent reproductions of this issue. Regardless, if you look in > cpp/debug/debug, there is no plasma store executable. If you then call make, > the store executable is generated in that directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable
[ https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582765#comment-16582765 ] Lukasz Bartnik edited comment on ARROW-1799 at 8/16/18 4:18 PM: {{make unittest}} fails repeatedly unless {{make all}}, which creates {{libarrow.\*}} and {{libplasma.\*}} libraries, is run beforehand. Quite possibly, the {{unittest}} target needs additional dependencies. was (Author: lbartnik): {{make unittest}} fails repeatedly unless {{make all}}, which creates {{libarrow.*}} and {{libplasma.*}} libraries, is run beforehand. Quite possibly, the {{unittest}} target needs additional dependencies. > [Plasma C++] Make unittest does not create plasma store executable > -- > > Key: ARROW-1799 > URL: https://issues.apache.org/jira/browse/ARROW-1799 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: William Paul >Priority: Minor > > Steps to reproduce from a fresh clone of Arrow: > mkdir cpp/debug > cd cpp/debug > cmake .. -DARROW_PLASMA=on > make -j8 unittest > client_tests may then fail due to the store executable not being created. The > first time I reproduced the issue the test did fail, but the test passed on > subsequent reproductions of this issue. Regardless, if you look in > cpp/debug/debug, there is no plasma store executable. If you then call make, > the store executable is generated in that directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable
[ https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582765#comment-16582765 ] Lukasz Bartnik commented on ARROW-1799: --- {{make unittest}} fails repeatedly unless {{make all}}, which creates {{libarrow.*}} and {{libplasma.*}} libraries, is run beforehand. Quite possibly, the {{unittest}} target needs additional dependencies. > [Plasma C++] Make unittest does not create plasma store executable > -- > > Key: ARROW-1799 > URL: https://issues.apache.org/jira/browse/ARROW-1799 > Project: Apache Arrow > Issue Type: Bug > Components: Plasma (C++) >Reporter: William Paul >Priority: Minor > > Steps to reproduce from a fresh clone of Arrow: > mkdir cpp/debug > cd cpp/debug > cmake .. -DARROW_PLASMA=on > make -j8 unittest > client_tests may then fail due to the store executable not being created. The > first time I reproduced the issue the test did fail, but the test passed on > subsequent reproductions of this issue. Regardless, if you look in > cpp/debug/debug, there is no plasma store executable. If you then call make, > the store executable is generated in that directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3050) [C++] Adopt HiveServer2 client C++ codebase
[ https://issues.apache.org/jira/browse/ARROW-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582681#comment-16582681 ] Wes McKinney commented on ARROW-3050: - In progress: https://github.com/wesm/arrow/tree/hs2client-fork > [C++] Adopt HiveServer2 client C++ codebase > --- > > Key: ARROW-3050 > URL: https://issues.apache.org/jira/browse/ARROW-3050 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Wes McKinney >Priority: Major > > I helped develop a small C++/Python library for interacting with databases > like Hive and Impala via the HiveServer2 Thrift protocol and making them > accessible to Python / pandas: > https://github.com/cloudera/hs2client > Internally this interfaces with HS2's own columnar representation. Arrow is a > natural partner for this project, much of which could be discarded. I think > Arrow would make as much sense as any place to develop this codebase further. > It could be later split off into a new project if a large enough community > develops > cc [~twmarshall] [~mjacobs] for thoughts > If we did this, do we need to do a software grant (essentially what I'm > proposing is to fork)? Can we just attribute the original Cloudera authors in > LICENSE.txt? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3050) [C++] Adopt HiveServer2 client C++ codebase
[ https://issues.apache.org/jira/browse/ARROW-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582679#comment-16582679 ] Wes McKinney commented on ARROW-3050: - Makes sense. I think this supports the argument to bring hs2client and Arrow closer together. I'm not proposing to include this in Arrow's CI since the testing procedure (with Hive or Impala) is more complicated. I started a branch to do the integration work; when I have something worth looking at, I will put up a PR. > [C++] Adopt HiveServer2 client C++ codebase > --- > > Key: ARROW-3050 > URL: https://issues.apache.org/jira/browse/ARROW-3050 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Wes McKinney >Priority: Major > > I helped develop a small C++/Python library for interacting with databases > like Hive and Impala via the HiveServer2 Thrift protocol and making them > accessible to Python / pandas: > https://github.com/cloudera/hs2client > Internally this interfaces with HS2's own columnar representation. Arrow is a > natural partner for this project, much of which could be discarded. I think > Arrow would make as much sense as any place to develop this codebase further. > It could be later split off into a new project if a large enough community > develops > cc [~twmarshall] [~mjacobs] for thoughts > If we did this, do we need to do a software grant (essentially what I'm > proposing is to fork)? Can we just attribute the original Cloudera authors in > LICENSE.txt? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-3059) [C++] Streamline namespace array::test
[ https://issues.apache.org/jira/browse/ARROW-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-3059. - Resolution: Fixed Fix Version/s: 0.11.0 Issue resolved by pull request 2436 [https://github.com/apache/arrow/pull/2436] > [C++] Streamline namespace array::test > -- > > Key: ARROW-3059 > URL: https://issues.apache.org/jira/browse/ARROW-3059 > Project: Apache Arrow > Issue Type: Task > Components: C++ >Affects Versions: 0.10.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 0.11.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Currently we have some test helpers that live in the {{arrow::test}} > namespace, some in {{arrow}} (or topic subnamespaces such as {{arrow::io}}). > I see no reason for the discrepancy. > I propose the simple solution of removing the {{arrow::test}} namespace > altogether. If not desirable, then we should make sure we put all helpers in > that namespace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-3062) [Python] Extend fast libtensorflow_framework.so compatibility workaround to Python 2.7
[ https://issues.apache.org/jira/browse/ARROW-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-3062. - Resolution: Fixed Fix Version/s: 0.11.0 Issue resolved by pull request 2435 [https://github.com/apache/arrow/pull/2435] > [Python] Extend fast libtensorflow_framework.so compatibility workaround to > Python 2.7 > -- > > Key: ARROW-3062 > URL: https://issues.apache.org/jira/browse/ARROW-3062 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.10.0 >Reporter: Philipp Moritz >Assignee: Philipp Moritz >Priority: Major > Labels: pull-request-available > Fix For: 0.11.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The workaround ARROW-2657 should be optimized a little bit and use the > loading of libtensorflow_framework.so (instead of doing a full "import > tensorflow") also for Python 2.7. > We are running into this, since doing "import tensorflow" spawns a number of > threads, so without this optimization, using many python processes with > pyarrow will hit OS limits for threads. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
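The idea in this issue, loading libtensorflow_framework.so directly rather than running a full "import tensorflow", can be sketched roughly as follows. This is an illustration under stated assumptions, not pyarrow's actual workaround: the helper names find_tensorflow_framework_lib and preload_tensorflow_framework are hypothetical, and the sketch assumes the shared library sits directly inside the installed tensorflow package directory.

```python
import ctypes
import importlib.util
import os


def find_tensorflow_framework_lib(package="tensorflow",
                                  lib_name="libtensorflow_framework.so"):
    """Return the path to `lib_name` inside the installed package, or None.

    importlib.util.find_spec locates a top-level package without executing
    its __init__.py, so this lookup does not start any TensorFlow threads.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    # Plain modules (not packages) have no search locations.
    if spec is None or not spec.submodule_search_locations:
        return None
    for directory in spec.submodule_search_locations:
        candidate = os.path.join(directory, lib_name)
        if os.path.exists(candidate):
            return candidate
    return None


def preload_tensorflow_framework():
    """Load the shared library if it can be found; return True on success."""
    path = find_tensorflow_framework_lib()
    if path is None:
        return False
    # Assumption: a plain dlopen of the library is sufficient here.
    ctypes.CDLL(path)
    return True
```

Because the lookup never executes the package, a process that calls preload_tensorflow_framework() avoids the thread start-up cost that the issue attributes to "import tensorflow". When TensorFlow is not installed the function simply returns False.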
[jira] [Commented] (ARROW-3048) Import pyarrow fails if scikit-learn is installed from conda
[ https://issues.apache.org/jira/browse/ARROW-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582326#comment-16582326 ] Uwe L. Korn commented on ARROW-3048: This is because boost-cpp from defaults is installed. Please use C++ packages from only one of defaults or conda-forge; don't mix the two repositories. > Import pyarrow fails if scikit-learn is installed from conda > > > Key: ARROW-3048 > URL: https://issues.apache.org/jira/browse/ARROW-3048 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.10.0 > Environment: Ubuntu 16.04 >Reporter: Jarno Seppanen >Priority: Major > > Hi, installing both pyarrow 0.10.0 and scikit-learn 0.19.2 causes pyarrow > import to break. > Steps to reproduce > # cat >environment.yml <<EOF > {code:java} > name: asdf > channels: > - defaults > - conda-forge > dependencies: > - python=3.6 > - pyarrow=0.10.0 > - scikit-learn=0.19.2{code} > EOF > # conda env create > # source activate asdf > # python -c 'import pyarrow' > {code:java} > Traceback (most recent call last): > File "<string>", line 1, in <module> > File > "/home/jarno/miniconda3/envs/asdf/lib/python3.6/site-packages/pyarrow/__init__.py", > line 60, in <module> > from pyarrow.lib import cpu_count, set_cpu_count > ImportError: > /home/jarno/miniconda3/envs/asdf/lib/python3.6/site-packages/pyarrow/../../../libparquet.so.1: > undefined symbol: > _ZN5boost13match_resultsIN9__gnu_cxx17__normal_iteratorIPKcSsEESaINS_9sub_matchIS5_12maybe_assignERKS9_{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
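One way to spot the channel mixing described in the comment is to group the C++ packages of an environment by the channel they were installed from. The sketch below assumes records shaped like the entries of "conda list --json" (dictionaries with "name" and "channel" keys). The helper names and the sample records are made up for illustration and are not taken from the reporter's environment.

```python
from collections import defaultdict


def packages_by_channel(records, names):
    """Group the selected package names by their source channel."""
    wanted = set(names)
    grouped = defaultdict(list)
    for record in records:
        if record.get("name") in wanted:
            grouped[record.get("channel", "unknown")].append(record["name"])
    return {channel: sorted(pkgs) for channel, pkgs in grouped.items()}


def has_mixed_channels(records, names):
    """Return True if the selected packages come from more than one channel."""
    return len(packages_by_channel(records, names)) > 1


# Hypothetical records; the channel labels are assumptions for the example.
sample = [
    {"name": "boost-cpp", "channel": "pkgs/main"},
    {"name": "arrow-cpp", "channel": "conda-forge"},
    {"name": "parquet-cpp", "channel": "conda-forge"},
    {"name": "python", "channel": "pkgs/main"},
]
cpp_packages = ["boost-cpp", "arrow-cpp", "parquet-cpp"]

print(packages_by_channel(sample, cpp_packages))
print(has_mixed_channels(sample, cpp_packages))
```

If the check reports more than one channel for the C++ libraries, recreating the environment with a single channel for those packages is the remedy the comment suggests.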
[jira] [Commented] (ARROW-3050) [C++] Adopt HiveServer2 client C++ codebase
[ https://issues.apache.org/jira/browse/ARROW-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1658#comment-1658 ] Uwe L. Korn commented on ARROW-3050: Hive is working on Arrow support for their connectors. Once that is in a state to be developed against, this could then be used in hs2client. I think before that we should try to keep them separate to not overload the Arrow build. > [C++] Adopt HiveServer2 client C++ codebase > --- > > Key: ARROW-3050 > URL: https://issues.apache.org/jira/browse/ARROW-3050 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Wes McKinney >Priority: Major > > I helped develop a small C++/Python library for interacting with databases > like Hive and Impala via the HiveServer2 Thrift protocol and making them > accessible to Python / pandas: > https://github.com/cloudera/hs2client > Internally this interfaces with HS2's own columnar representation. Arrow is a > natural partner for this project, much of which could be discarded. I think > Arrow would make as much sense as any place to develop this codebase further. > It could be later split off into a new project if a large enough community > develops > cc [~twmarshall] [~mjacobs] for thoughts > If we did this, do we need to do a software grant (essentially what I'm > proposing is to fork)? Can we just attribute the original Cloudera authors in > LICENSE.txt? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3059) [C++] Streamline namespace array::test
[ https://issues.apache.org/jira/browse/ARROW-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3059: -- Labels: pull-request-available (was: ) > [C++] Streamline namespace array::test > -- > > Key: ARROW-3059 > URL: https://issues.apache.org/jira/browse/ARROW-3059 > Project: Apache Arrow > Issue Type: Task > Components: C++ >Affects Versions: 0.10.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > > Currently we have some test helpers that live in the {{arrow::test}} > namespace, some in {{arrow}} (or topic subnamespaces such as {{arrow::io}}). > I see no reason for the discrepancy. > I propose the simple solution of removing the {{arrow::test}} namespace > altogether. If not desirable, then we should make sure we put all helpers in > that namespace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-3059) [C++] Streamline namespace array::test
[ https://issues.apache.org/jira/browse/ARROW-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-3059: - Assignee: Antoine Pitrou > [C++] Streamline namespace array::test > -- > > Key: ARROW-3059 > URL: https://issues.apache.org/jira/browse/ARROW-3059 > Project: Apache Arrow > Issue Type: Task > Components: C++ >Affects Versions: 0.10.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > > Currently we have some test helpers that live in the {{arrow::test}} > namespace, some in {{arrow}} (or topic subnamespaces such as {{arrow::io}}). > I see no reason for the discrepancy. > I propose the simple solution of removing the {{arrow::test}} namespace > altogether. If not desirable, then we should make sure we put all helpers in > that namespace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3063) [Go] move list of supported/TODO features to confluence
Sebastien Binet created ARROW-3063: -- Summary: [Go] move list of supported/TODO features to confluence Key: ARROW-3063 URL: https://issues.apache.org/jira/browse/ARROW-3063 Project: Apache Arrow Issue Type: Improvement Components: Go Reporter: Sebastien Binet as mentioned in https://github.com/apache/arrow/pull/2421#discussion_r210033779 we should move the list of supported features (and those that still need to be implemented) to confluence. filing this so we don't forget about it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)