[jira] [Commented] (ARROW-5507) [Plasma] [CUDA] Compile error
[ https://issues.apache.org/jira/browse/ARROW-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855531#comment-16855531 ]

Antoine Pitrou commented on ARROW-5507:
---------------------------------------

Probably introduced in ARROW-5365.

> [Plasma] [CUDA] Compile error
> -----------------------------
>
>                 Key: ARROW-5507
>                 URL: https://issues.apache.org/jira/browse/ARROW-5507
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++ - Plasma, GPU
>            Reporter: Antoine Pitrou
>            Priority: Critical
>
> I've started getting this today:
> {code}
> ../src/plasma/protocol.cc:546:55: error: no matching member function for call to 'CreateVector'
>     handles.push_back(fb::CreateCudaHandle(fbb, fbb.CreateVector(handle)));
>                                                     ^~~~
> /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1484:27: note: candidate function not viable: no known conversion from 'std::shared_ptr' to 'const std::vector' for 1st argument
>   Offset> CreateVector(const std::vector ) {
>           ^
> /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1477:42: note: candidate template ignored: could not match 'vector' against 'shared_ptr'
>   template Offset> CreateVector(const std::vector ) {
>                    ^
> /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1443:42: note: candidate function template not viable: requires 2 arguments, but 1 was provided
>   template Offset> CreateVector(const T *v, size_t len) {
>                    ^
> /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1465:29: note: candidate function template not viable: requires 2 arguments, but 1 was provided
>   Offset>> CreateVector(const Offset *v, size_t len) {
>            ^
> /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1501:42: note: candidate function template not viable: requires 2 arguments, but 1 was provided
>   template Offset> CreateVector(size_t vector_size,
>                    ^
> /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1520:21: note: candidate function template not viable: requires 3 arguments, but 1 was provided
>   Offset> CreateVector(size_t vector_size, F f, S *state) {
>           ^
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (ARROW-5485) [Gandiva][Crossbow] OSx builds failing
[ https://issues.apache.org/jira/browse/ARROW-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855631#comment-16855631 ]

Praveen Kumar Desabandu commented on ARROW-5485:
------------------------------------------------

[~wesmckinn] I think this started when we switched to using a shared gtest library. We get the following error (both locally and in Travis when building gtest from source):

dyld: Library not loaded: libgtest_main.dylib
  Referenced from: /Users/travis/build/[secure]/arrow-build/arrow/cpp/build/./release/gandiva-decimal_test
  Reason: image not found
dev/tasks/gandiva-jars/build-cpp-osx.sh: line 45:  5626 Abort trap: 6

All tests fail with this error. Maybe the rpath for the library built from source is not being set correctly? I ran with the CMake rpath flag turned on, but it did not help. I plan to turn tests off in the OS X Crossbow build if this is not a quick fix.

> [Gandiva][Crossbow] OSx builds failing
> --------------------------------------
>
>                 Key: ARROW-5485
>                 URL: https://issues.apache.org/jira/browse/ARROW-5485
>             Project: Apache Arrow
>          Issue Type: Task
>          Components: Packaging
>    Affects Versions: 0.14.0
>            Reporter: Praveen Kumar Desabandu
>            Assignee: Praveen Kumar Desabandu
>            Priority: Major
>             Fix For: 0.14.0
>
> OSX builds have been failing for the last 3 days.
[jira] [Updated] (ARROW-5334) [C++] Add "Type" to names of arrow::Integer, arrow::FloatingPoint classes for consistency
[ https://issues.apache.org/jira/browse/ARROW-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-5334:
----------------------------------
    Labels: pull-request-available  (was: )

> [C++] Add "Type" to names of arrow::Integer, arrow::FloatingPoint classes for consistency
> -----------------------------------------------------------------------------------------
>
>                 Key: ARROW-5334
>                 URL: https://issues.apache.org/jira/browse/ARROW-5334
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
> These intermediate classes used for template metaprogramming (in particular, {{std::is_base_of}}) have names that are inconsistent with the rest of the data types. For clarity, I think we should add "Type" to these class names and others like them.
> Please do after ARROW-3144.
[jira] [Comment Edited] (ARROW-5236) [Python] hdfs.connect() is trying to load libjvm in windows
[ https://issues.apache.org/jira/browse/ARROW-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855674#comment-16855674 ]

Urmila edited comment on ARROW-5236 at 6/4/19 1:06 PM:
-------------------------------------------------------

Hi, I am also facing the same issue. I have conda and spark installed on my local machine and am trying to connect to HDFS as shown below:

import pyarrow as pa
fs = pa.hdfs.connect('hostname.xx.xx.com', port_number, user='a...@xyx.com', kerb_ticket='local machine path')

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
    extra_conf=extra_conf)
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libjvm

was (Author: urmilarv):
Hi, I am also facing the same issue, but could not find fix details in any of JIRA ARROW-5236 or 4215. Please help.

I have conda and spark installed on my local machine and am trying to connect to HDFS as shown below:

import pyarrow as pa
fs = pa.hdfs.connect('hostname.xx.xx.com', port_number, user='a...@xyx.com', kerb_ticket='local machine path')

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
    extra_conf=extra_conf)
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libjvm

> [Python] hdfs.connect() is trying to load libjvm in windows
> -----------------------------------------------------------
>
>                 Key: ARROW-5236
>                 URL: https://issues.apache.org/jira/browse/ARROW-5236
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>         Environment: Windows 7 Enterprise, pyarrow 0.13.0
>            Reporter: Kamaraju
>            Priority: Major
>              Labels: hdfs
>
> This issue was originally reported at https://github.com/apache/arrow/issues/4215 . Raising a Jira as per Wes McKinney's request.
> Summary:
> The following script
> {code}
> $ cat expt2.py
> import pyarrow as pa
> fs = pa.hdfs.connect()
> {code}
> tries to load libjvm in windows 7, which is not expected.
[jira] [Commented] (ARROW-5236) [Python] hdfs.connect() is trying to load libjvm in windows
[ https://issues.apache.org/jira/browse/ARROW-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855674#comment-16855674 ]

Urmila commented on ARROW-5236:
-------------------------------

Hi, I am also facing the same issue, but could not find fix details in any of JIRA ARROW-5236 or 4215. Please help.

I have conda and spark installed on my local machine and am trying to connect to HDFS as shown below:

import pyarrow as pa
fs = pa.hdfs.connect('hostname.xx.xx.com', port_number, user='a...@xyx.com', kerb_ticket='local machine path')

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
    extra_conf=extra_conf)
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libjvm

> [Python] hdfs.connect() is trying to load libjvm in windows
> -----------------------------------------------------------
>
>                 Key: ARROW-5236
>                 URL: https://issues.apache.org/jira/browse/ARROW-5236
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>         Environment: Windows 7 Enterprise, pyarrow 0.13.0
>            Reporter: Kamaraju
>            Priority: Major
>              Labels: hdfs
>
> This issue was originally reported at https://github.com/apache/arrow/issues/4215 . Raising a Jira as per Wes McKinney's request.
> Summary:
> The following script
> {code}
> $ cat expt2.py
> import pyarrow as pa
> fs = pa.hdfs.connect()
> {code}
> tries to load libjvm in windows 7, which is not expected.
> {noformat}
> $ python ./expt2.py
> Traceback (most recent call last):
>   File "./expt2.py", line 3, in
>     fs = pa.hdfs.connect()
>   File "C:\ProgramData\Continuum\Anaconda\envs\scratch_py36_pyarrow\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
>     extra_conf=extra_conf)
>   File "C:\ProgramData\Continuum\Anaconda\envs\scratch_py36_pyarrow\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
>     self._connect(host, port, user, kerb_ticket, driver, extra_conf)
>   File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
>   File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Unable to load libjvm
> {noformat}
> There is no libjvm file in a Windows Java installation:
> {noformat}
> $ echo $JAVA_HOME
> C:\Progra~1\Java\jdk1.8.0_141
> $ find $JAVA_HOME -iname '*libjvm*'
> {noformat}
> I see the libjvm error with both the 0.11.1 and 0.13.0 versions of pyarrow.
> Steps to reproduce the issue (with more details):
> Create the environment:
> {noformat}
> $ cat scratch_py36_pyarrow.yml
> name: scratch_py36_pyarrow
> channels:
>   - defaults
> dependencies:
>   - python=3.6.8
>   - pyarrow
> {noformat}
> {noformat}
> $ conda env create -f scratch_py36_pyarrow.yml
> {noformat}
> Apply the following patch to lib/site-packages/pyarrow/hdfs.py . I had to do this since the Hadoop installation that comes with the MapR <https://mapr.com/> windows client only has $HADOOP_HOME/bin/hadoop.cmd . There is no file named $HADOOP_HOME/bin/hadoop, so the subsequent subprocess.check_output call fails with FileNotFoundError if this patch is not applied.
> {noformat}
> $ cat ~/x/patch.txt
> 131c131
> < hadoop_bin = '{0}/bin/hadoop'.format(os.environ['HADOOP_HOME'])
> ---
> > hadoop_bin = '{0}/bin/hadoop.cmd'.format(os.environ['HADOOP_HOME'])
> $ patch /c/ProgramData/Continuum/Anaconda/envs/scratch_py36_pyarrow/lib/site-packages/pyarrow/hdfs.py ~/x/patch.txt
> patching file /c/ProgramData/Continuum/Anaconda/envs/scratch_py36_pyarrow/lib/site-packages/pyarrow/hdfs.py
> {noformat}
> Activate the environment:
> {noformat}
> $ source activate scratch_py36_pyarrow
> {noformat}
> Sample script:
> {noformat}
> $ cat expt2.py
> import pyarrow as pa
> fs = pa.hdfs.connect()
> {noformat}
> Execute the script:
> {noformat}
> $ python ./expt2.py
> Traceback (most recent call last):
>   File "./expt2.py", line 3, in
>     fs = pa.hdfs.connect()
>   File "C:\ProgramData\Continuum\Anaconda\envs\scratch_py36_pyarrow\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
>     extra_conf=extra_conf)
>   File "C:\ProgramData\Continuum\Anaconda\envs\scratch_py36_pyarrow\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
>     self._connect(host, port, user, kerb_ticket, driver, extra_conf)
>   File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
>   File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Unable to load libjvm
> {noformat}
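[Editor's note] The "Unable to load libjvm" failures above come down to locating the JVM shared library, which on Windows is jvm.dll inside the JDK rather than a file named libjvm (which is why `find $JAVA_HOME -iname '*libjvm*'` above finds nothing). A minimal sketch of that lookup follows; the helper `find_jvm_dll` and its candidate paths are illustrative assumptions, not pyarrow's actual search logic:

```python
import os

# Hypothetical helper (not part of pyarrow): check the places a Windows JDK
# conventionally keeps the JVM shared library. Modern JDKs use bin/server,
# older JDK 8 layouts use jre/bin/server.
def find_jvm_dll(java_home):
    candidates = [
        os.path.join(java_home, "bin", "server", "jvm.dll"),
        os.path.join(java_home, "jre", "bin", "server", "jvm.dll"),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

If this returns a path, adding its directory to PATH before importing pyarrow is a common workaround; if it returns None, the JDK installation itself is missing the server JVM.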
[jira] [Assigned] (ARROW-5020) [C++][Gandiva] Split Gandiva-related conda packages for builds into separate .yml conda env file
[ https://issues.apache.org/jira/browse/ARROW-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou reassigned ARROW-5020:
-------------------------------------
    Assignee: Antoine Pitrou

> [C++][Gandiva] Split Gandiva-related conda packages for builds into separate .yml conda env file
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-5020
>                 URL: https://issues.apache.org/jira/browse/ARROW-5020
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Continuous Integration
>            Reporter: Wes McKinney
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> These installs are large and should not be required unconditionally in CI and elsewhere.
[jira] [Commented] (ARROW-5491) [C++] Remove unnecessary semicolons following MACRO definitions
[ https://issues.apache.org/jira/browse/ARROW-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855693#comment-16855693 ]

Antoine Pitrou commented on ARROW-5491:
---------------------------------------

This is fixed, no?

> [C++] Remove unnecessary semicolons following MACRO definitions
> ---------------------------------------------------------------
>
>                 Key: ARROW-5491
>                 URL: https://issues.apache.org/jira/browse/ARROW-5491
>             Project: Apache Arrow
>          Issue Type: Task
>          Components: C++
>    Affects Versions: 0.13.0
>            Reporter: Brian Hulette
>            Assignee: Brian Hulette
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
[jira] [Created] (ARROW-5507) [Plasma] [C++] Compile error
Antoine Pitrou created ARROW-5507:
-------------------------------------

             Summary: [Plasma] [C++] Compile error
                 Key: ARROW-5507
                 URL: https://issues.apache.org/jira/browse/ARROW-5507
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++ - Plasma
            Reporter: Antoine Pitrou

I've started getting this today:

{code}
../src/plasma/protocol.cc:546:55: error: no matching member function for call to 'CreateVector'
    handles.push_back(fb::CreateCudaHandle(fbb, fbb.CreateVector(handle)));
                                                    ^~~~
/home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1484:27: note: candidate function not viable: no known conversion from 'std::shared_ptr' to 'const std::vector' for 1st argument
  Offset> CreateVector(const std::vector ) {
          ^
/home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1477:42: note: candidate template ignored: could not match 'vector' against 'shared_ptr'
  template Offset> CreateVector(const std::vector ) {
                   ^
/home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1443:42: note: candidate function template not viable: requires 2 arguments, but 1 was provided
  template Offset> CreateVector(const T *v, size_t len) {
                   ^
/home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1465:29: note: candidate function template not viable: requires 2 arguments, but 1 was provided
  Offset>> CreateVector(const Offset *v, size_t len) {
           ^
/home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1501:42: note: candidate function template not viable: requires 2 arguments, but 1 was provided
  template Offset> CreateVector(size_t vector_size,
                   ^
/home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1520:21: note: candidate function template not viable: requires 3 arguments, but 1 was provided
  Offset> CreateVector(size_t vector_size, F f, S *state) {
          ^
{code}
[jira] [Updated] (ARROW-5507) [Plasma] [CUDA] Compile error
[ https://issues.apache.org/jira/browse/ARROW-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-5507:
----------------------------------
    Labels: pull-request-available  (was: )

> [Plasma] [CUDA] Compile error
> -----------------------------
>
>                 Key: ARROW-5507
>                 URL: https://issues.apache.org/jira/browse/ARROW-5507
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++ - Plasma, GPU
>            Reporter: Antoine Pitrou
>            Priority: Critical
>              Labels: pull-request-available
[jira] [Updated] (ARROW-5507) [Plasma] [CUDA] Compile error
[ https://issues.apache.org/jira/browse/ARROW-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou updated ARROW-5507:
----------------------------------
    Component/s: GPU

> [Plasma] [CUDA] Compile error
> -----------------------------
>
>                 Key: ARROW-5507
>                 URL: https://issues.apache.org/jira/browse/ARROW-5507
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++ - Plasma, GPU
>            Reporter: Antoine Pitrou
>            Priority: Critical
[jira] [Assigned] (ARROW-5334) [C++] Add "Type" to names of arrow::Integer, arrow::FloatingPoint classes for consistency
[ https://issues.apache.org/jira/browse/ARROW-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou reassigned ARROW-5334:
-------------------------------------
    Assignee: Antoine Pitrou  (was: Wes McKinney)

> [C++] Add "Type" to names of arrow::Integer, arrow::FloatingPoint classes for consistency
> -----------------------------------------------------------------------------------------
>
>                 Key: ARROW-5334
>                 URL: https://issues.apache.org/jira/browse/ARROW-5334
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Antoine Pitrou
>            Priority: Major
>             Fix For: 0.14.0
>
> These intermediate classes used for template metaprogramming (in particular, {{std::is_base_of}}) have names that are inconsistent with the rest of the data types. For clarity, I think we should add "Type" to these class names and others like them.
> Please do after ARROW-3144.
[jira] [Comment Edited] (ARROW-5236) [Python] hdfs.connect() is trying to load libjvm in windows
[ https://issues.apache.org/jira/browse/ARROW-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855674#comment-16855674 ]

Urmila edited comment on ARROW-5236 at 6/4/19 1:07 PM:
-------------------------------------------------------

Hi, I am also facing the same issue. I have conda and spark installed on my local Windows machine and am trying to connect to HDFS (on Unix) as shown below:

import pyarrow as pa
fs = pa.hdfs.connect('hostname.xx.xx.com', port_number, user='a...@xyx.com', kerb_ticket='local machine path')

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
    extra_conf=extra_conf)
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libjvm

was (Author: urmilarv):
Hi, I am also facing the same issue. I have conda and spark installed on my local machine and am trying to connect to HDFS as shown below:

import pyarrow as pa
fs = pa.hdfs.connect('hostname.xx.xx.com', port_number, user='a...@xyx.com', kerb_ticket='local machine path')

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
    extra_conf=extra_conf)
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libjvm

> [Python] hdfs.connect() is trying to load libjvm in windows
> -----------------------------------------------------------
>
>                 Key: ARROW-5236
>                 URL: https://issues.apache.org/jira/browse/ARROW-5236
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>         Environment: Windows 7 Enterprise, pyarrow 0.13.0
>            Reporter: Kamaraju
>            Priority: Major
>              Labels: hdfs
>
> This issue was originally reported at https://github.com/apache/arrow/issues/4215 . Raising a Jira as per Wes McKinney's request.
> Summary:
> The following script
> {code}
> $ cat expt2.py
> import pyarrow as pa
> fs = pa.hdfs.connect()
> {code}
> tries to load libjvm in windows 7, which is not expected.
[jira] [Commented] (ARROW-5334) [C++] Add "Type" to names of arrow::Integer, arrow::FloatingPoint classes for consistency
[ https://issues.apache.org/jira/browse/ARROW-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855679#comment-16855679 ]

Antoine Pitrou commented on ARROW-5334:
---------------------------------------

Applies to {{Number}} as well.

> [C++] Add "Type" to names of arrow::Integer, arrow::FloatingPoint classes for consistency
> -----------------------------------------------------------------------------------------
>
>                 Key: ARROW-5334
>                 URL: https://issues.apache.org/jira/browse/ARROW-5334
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Antoine Pitrou
>            Priority: Major
>             Fix For: 0.14.0
>
> These intermediate classes used for template metaprogramming (in particular, {{std::is_base_of}}) have names that are inconsistent with the rest of the data types. For clarity, I think we should add "Type" to these class names and others like them.
> Please do after ARROW-3144.
[jira] [Commented] (ARROW-3779) [C++/Python] Validate timezone passed to pa.timestamp
[ https://issues.apache.org/jira/browse/ARROW-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855723#comment-16855723 ]

Antoine Pitrou commented on ARROW-3779:
---------------------------------------

Validating the timezone implies we have access to the Olson database or something similar. I'm not sure this is a priority for us given the amount of scaffolding that would probably be required in the build chain.

> [C++/Python] Validate timezone passed to pa.timestamp
> -----------------------------------------------------
>
>                 Key: ARROW-3779
>                 URL: https://issues.apache.org/jira/browse/ARROW-3779
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Krisztian Szucs
>            Priority: Major
>             Fix For: 0.14.0
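[Editor's note] On the Python side, the validation under discussion could look roughly like the sketch below, which assumes access to the IANA (Olson) database via the standard-library zoneinfo module (Python 3.9+). The helper `is_valid_timezone` is hypothetical, not an Arrow API:

```python
import re
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# Hypothetical validator: accept fixed offsets like "+05:30" as well as
# IANA zone names, rejecting anything the tz database does not know.
def is_valid_timezone(tz):
    if re.fullmatch(r"[+-]\d{2}:\d{2}", tz):
        return True
    try:
        ZoneInfo(tz)
        return True
    except (ZoneInfoNotFoundError, ValueError):
        return False
```

The C++ side would need equivalent access to the tz database, which is the build-chain scaffolding the comment above refers to.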
[jira] [Updated] (ARROW-3779) [C++/Python] Validate timezone passed to pa.timestamp
[ https://issues.apache.org/jira/browse/ARROW-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou updated ARROW-3779:
----------------------------------
    Priority: Minor  (was: Major)

> [C++/Python] Validate timezone passed to pa.timestamp
> -----------------------------------------------------
>
>                 Key: ARROW-3779
>                 URL: https://issues.apache.org/jira/browse/ARROW-3779
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Krisztian Szucs
>            Priority: Minor
>             Fix For: 0.14.0
[jira] [Updated] (ARROW-3779) [C++/Python] Validate timezone passed to pa.timestamp
[ https://issues.apache.org/jira/browse/ARROW-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou updated ARROW-3779:
----------------------------------
    Fix Version/s: (was: 0.14.0)

> [C++/Python] Validate timezone passed to pa.timestamp
> -----------------------------------------------------
>
>                 Key: ARROW-3779
>                 URL: https://issues.apache.org/jira/browse/ARROW-3779
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Krisztian Szucs
>            Priority: Minor
[jira] [Resolved] (ARROW-5491) [C++] Remove unnecessary semicolons following MACRO definitions
[ https://issues.apache.org/jira/browse/ARROW-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Hulette resolved ARROW-5491.
----------------------------------
    Resolution: Fixed

> [C++] Remove unnecessary semicolons following MACRO definitions
> ---------------------------------------------------------------
>
>                 Key: ARROW-5491
>                 URL: https://issues.apache.org/jira/browse/ARROW-5491
>             Project: Apache Arrow
>          Issue Type: Task
>          Components: C++
>    Affects Versions: 0.13.0
>            Reporter: Brian Hulette
>            Assignee: Brian Hulette
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
[jira] [Commented] (ARROW-3877) [C++] Provide access to "maximum decompressed size" functions in compression libraries (if they exist)
[ https://issues.apache.org/jira/browse/ARROW-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855722#comment-16855722 ]

Antoine Pitrou commented on ARROW-3877:
---------------------------------------

Do we currently have a use case for one-shot decompression without knowing the decompressed length? Compressed files are read using streaming decompression (which is more reasonable for huge data anyway).

> [C++] Provide access to "maximum decompressed size" functions in compression libraries (if they exist)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-3877
>                 URL: https://issues.apache.org/jira/browse/ARROW-3877
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 0.14.0
>
> As a follow-up to ARROW-3831, some compression libraries have a function that provides a hint for sizing the output buffer (if it is not known already) for one-shot decompression. This would be helpful for sizing allocations in such cases.
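[Editor's note] The contrast drawn in the comment above can be illustrated with the standard-library zlib module (a sketch only; Arrow's C++ Codec API is not shown). A streaming decompressor consumes bounded input chunks and grows its output incrementally, so the total decompressed size never needs to be known up front:

```python
import zlib

original = b"some highly repetitive payload " * 4000
compressed = zlib.compress(original)

# One-shot: zlib manages output growth itself, but codecs without a
# self-describing frame need the caller to size the output buffer, which is
# where a "maximum decompressed size" hint would help.
one_shot = zlib.decompress(compressed)

# Streaming: feed fixed-size chunks; the decompressor grows the output as
# it goes, so the decompressed length is never required in advance.
decompressor = zlib.decompressobj()
out = bytearray()
for i in range(0, len(compressed), 4096):
    out.extend(decompressor.decompress(compressed[i:i + 4096]))
out.extend(decompressor.flush())

assert one_shot == original and bytes(out) == original
```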
[jira] [Updated] (ARROW-5285) [C++][Plasma] GpuProcessHandle is not released when GPU object deleted
[ https://issues.apache.org/jira/browse/ARROW-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-5285: -- Component/s: GPU C++ - Plasma > [C++][Plasma] GpuProcessHandle is not released when GPU object deleted > -- > > Key: ARROW-5285 > URL: https://issues.apache.org/jira/browse/ARROW-5285 > Project: Apache Arrow > Issue Type: Bug > Components: C++, C++ - Plasma, GPU >Affects Versions: 0.13.0 >Reporter: shengjun.li >Assignee: shengjun.li >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > cpp/CMakeLists.txt > option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" > ON) > option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON) > In the plasma client, GpuProcessHandle is never released although the GPU object > is deleted. > Thus, cuIpcCloseMemHandle is never called. > When I repeatedly create and delete GPU memory, the following error may occur. > IOError: Cuda Driver API call in > /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 155 failed with > code 208: cuIpcOpenMemHandle(, *handle, > CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) > Note: CUDA_ERROR_ALREADY_MAPPED = 208 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5488) [R] Workaround when C++ lib not available
[ https://issues.apache.org/jira/browse/ARROW-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855730#comment-16855730 ] Romain François commented on ARROW-5488: That sounds easier than what I am currently trying to do :) > [R] Workaround when C++ lib not available > - > > Key: ARROW-5488 > URL: https://issues.apache.org/jira/browse/ARROW-5488 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Romain François >Priority: Major > > As a way to get to CRAN, we need some way for the package to still compile, > install, and test (although doing nothing useful) even when the C++ lib is not > available. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5331) [C++] FlightDataStream should be higher-level
[ https://issues.apache.org/jira/browse/ARROW-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855776#comment-16855776 ] Antoine Pitrou commented on ARROW-5331: --- I had overlooked that the {{FlightDescriptor}} is unused where {{FlightDataStream}} is concerned. {{FlightDataStream}} is used for the server's {{DoGet}} implementation only. So {{RecordBatchStream}} should already be sufficient in all cases where a record-batch-only data stream is desired. This leaves the question of heterogeneous Flight streams. They should be handled at the IPC layer first before adapting Flight to work with them. We probably need some kind of IPC {{Datum}} that can represent several different kinds of data (record batch, tensor...). > [C++] FlightDataStream should be higher-level > - > > Key: ARROW-5331 > URL: https://issues.apache.org/jira/browse/ARROW-5331 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, FlightRPC >Affects Versions: 0.13.0 >Reporter: Antoine Pitrou >Priority: Major > > Currently, {{FlightDataStream}} is expected to provide {{FlightPayload}} > objects. This requires the user to handle IPC serialization themselves. > Instead, it could provide higher-level {{FlightData}} objects (perhaps a > simple struct containing a {{FlightDescriptor}} and a {{RecordBatch}}), > letting Flight handle IPC encoding. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3877) [C++] Provide access to "maximum decompressed size" functions in compression libraries (if they exist)
[ https://issues.apache.org/jira/browse/ARROW-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-3877: -- Fix Version/s: (was: 0.14.0) > [C++] Provide access to "maximum decompressed size" functions in compression > libraries (if they exist) > -- > > Key: ARROW-3877 > URL: https://issues.apache.org/jira/browse/ARROW-3877 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Minor > > As follow up to ARROW-3831, some compression libraries have a function to > provide a hint for sizing the output buffer (if it is not known already) for > one-shot decompression. This would be helpful for sizing allocations in such > cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-5334) [C++] Add "Type" to names of arrow::Integer, arrow::FloatingPoint classes for consistency
[ https://issues.apache.org/jira/browse/ARROW-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-5334. --- Resolution: Fixed Issue resolved by pull request 4470 [https://github.com/apache/arrow/pull/4470] > [C++] Add "Type" to names of arrow::Integer, arrow::FloatingPoint classes for > consistency > - > > Key: ARROW-5334 > URL: https://issues.apache.org/jira/browse/ARROW-5334 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 20m > Remaining Estimate: 0h > > These intermediate classes used for template metaprogramming (in particular, > {{std::is_base_of}}) have inconsistent names with the rest of data types. For > clarity, I think we should add "Type" to these class names and others like > them > Please do after ARROW-3144 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3877) [C++] Provide access to "maximum decompressed size" functions in compression libraries (if they exist)
[ https://issues.apache.org/jira/browse/ARROW-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855783#comment-16855783 ] Antoine Pitrou commented on ARROW-3877: --- If it's only to provide a compression toolbox to Python users then I think this is low priority. > [C++] Provide access to "maximum decompressed size" functions in compression > libraries (if they exist) > -- > > Key: ARROW-3877 > URL: https://issues.apache.org/jira/browse/ARROW-3877 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.14.0 > > > As follow up to ARROW-3831, some compression libraries have a function to > provide a hint for sizing the output buffer (if it is not known already) for > one-shot decompression. This would be helpful for sizing allocations in such > cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3877) [C++] Provide access to "maximum decompressed size" functions in compression libraries (if they exist)
[ https://issues.apache.org/jira/browse/ARROW-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855784#comment-16855784 ] Wes McKinney commented on ARROW-3877: - Agreed > [C++] Provide access to "maximum decompressed size" functions in compression > libraries (if they exist) > -- > > Key: ARROW-3877 > URL: https://issues.apache.org/jira/browse/ARROW-3877 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Minor > > As follow up to ARROW-3831, some compression libraries have a function to > provide a hint for sizing the output buffer (if it is not known already) for > one-shot decompression. This would be helpful for sizing allocations in such > cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3877) [C++] Provide access to "maximum decompressed size" functions in compression libraries (if they exist)
[ https://issues.apache.org/jira/browse/ARROW-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-3877: -- Priority: Minor (was: Major) > [C++] Provide access to "maximum decompressed size" functions in compression > libraries (if they exist) > -- > > Key: ARROW-3877 > URL: https://issues.apache.org/jira/browse/ARROW-3877 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Minor > Fix For: 0.14.0 > > > As follow up to ARROW-3831, some compression libraries have a function to > provide a hint for sizing the output buffer (if it is not known already) for > one-shot decompression. This would be helpful for sizing allocations in such > cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-5507) [Plasma] [CUDA] Compile error
[ https://issues.apache.org/jira/browse/ARROW-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-5507: - Assignee: Antoine Pitrou > [Plasma] [CUDA] Compile error > - > > Key: ARROW-5507 > URL: https://issues.apache.org/jira/browse/ARROW-5507 > Project: Apache Arrow > Issue Type: Bug > Components: C++ - Plasma, GPU >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Critical > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 1h > Remaining Estimate: 0h > > I started getting this today: > {code} > ../src/plasma/protocol.cc:546:55: error: no matching member function for call > to 'CreateVector' > handles.push_back(fb::CreateCudaHandle(fbb, fbb.CreateVector(handle))); > ^~~~ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1484:27: > note: candidate function not viable: no known conversion from > 'std::shared_ptr' to 'const std::vector' for 1st argument > Offset> CreateVector(const std::vector ) { > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1477:42: > note: candidate template ignored: could not match 'vector' against > 'shared_ptr' > template Offset> CreateVector(const std::vector > ) { > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1443:42: > note: candidate function template not viable: requires 2 arguments, but 1 > was provided > template Offset> CreateVector(const T *v, size_t len) > { > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1465:29: > note: candidate function template not viable: requires 2 arguments, but 1 > was provided > Offset>> CreateVector(const Offset *v, size_t len) { > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1501:42: > note: candidate function template not viable: requires 2 arguments, but 1 > was provided > template Offset> CreateVector(size_t vector_size, > ^ > 
/home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1520:21: > note: candidate function template not viable: requires 3 arguments, but 1 > was provided > Offset> CreateVector(size_t vector_size, F f, S *state) { > ^ > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-5507) [Plasma] [CUDA] Compile error
[ https://issues.apache.org/jira/browse/ARROW-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-5507. --- Resolution: Fixed Fix Version/s: 0.14.0 Issue resolved by pull request 4468 [https://github.com/apache/arrow/pull/4468] > [Plasma] [CUDA] Compile error > - > > Key: ARROW-5507 > URL: https://issues.apache.org/jira/browse/ARROW-5507 > Project: Apache Arrow > Issue Type: Bug > Components: C++ - Plasma, GPU >Reporter: Antoine Pitrou >Priority: Critical > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 1h > Remaining Estimate: 0h > > I started getting this today: > {code} > ../src/plasma/protocol.cc:546:55: error: no matching member function for call > to 'CreateVector' > handles.push_back(fb::CreateCudaHandle(fbb, fbb.CreateVector(handle))); > ^~~~ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1484:27: > note: candidate function not viable: no known conversion from > 'std::shared_ptr' to 'const std::vector' for 1st argument > Offset> CreateVector(const std::vector ) { > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1477:42: > note: candidate template ignored: could not match 'vector' against > 'shared_ptr' > template Offset> CreateVector(const std::vector > ) { > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1443:42: > note: candidate function template not viable: requires 2 arguments, but 1 > was provided > template Offset> CreateVector(const T *v, size_t len) > { > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1465:29: > note: candidate function template not viable: requires 2 arguments, but 1 > was provided > Offset>> CreateVector(const Offset *v, size_t len) { > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1501:42: > note: candidate function template not viable: requires 2 arguments, but 1 > was provided > template Offset> CreateVector(size_t 
vector_size, > ^ > /home/antoine/miniconda3/envs/pyarrow/include/flatbuffers/flatbuffers.h:1520:21: > note: candidate function template not viable: requires 3 arguments, but 1 > was provided > Offset> CreateVector(size_t vector_size, F f, S *state) { > ^ > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-5285) [C++][Plasma] GpuProcessHandle is not released when GPU object deleted
[ https://issues.apache.org/jira/browse/ARROW-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-5285. --- Resolution: Fixed Issue resolved by pull request 4277 [https://github.com/apache/arrow/pull/4277] > [C++][Plasma] GpuProcessHandle is not released when GPU object deleted > -- > > Key: ARROW-5285 > URL: https://issues.apache.org/jira/browse/ARROW-5285 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.13.0 >Reporter: shengjun.li >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > cpp/CMakeLists.txt > option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" > ON) > option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON) > In the plasma client, GpuProcessHandle is never released although the GPU object > is deleted. > Thus, cuIpcCloseMemHandle is never called. > When I repeatedly create and delete GPU memory, the following error may occur. > IOError: Cuda Driver API call in > /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 155 failed with > code 208: cuIpcOpenMemHandle(, *handle, > CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) > Note: CUDA_ERROR_ALREADY_MAPPED = 208 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-5285) [C++][Plasma] GpuProcessHandle is not released when GPU object deleted
[ https://issues.apache.org/jira/browse/ARROW-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-5285: - Assignee: shengjun.li > [C++][Plasma] GpuProcessHandle is not released when GPU object deleted > -- > > Key: ARROW-5285 > URL: https://issues.apache.org/jira/browse/ARROW-5285 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.13.0 >Reporter: shengjun.li >Assignee: shengjun.li >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > cpp/CMakeLists.txt > option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" > ON) > option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON) > In the plasma client, GpuProcessHandle is never released although the GPU object > is deleted. > Thus, cuIpcCloseMemHandle is never called. > When I repeatedly create and delete GPU memory, the following error may occur. > IOError: Cuda Driver API call in > /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 155 failed with > code 208: cuIpcOpenMemHandle(, *handle, > CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) > Note: CUDA_ERROR_ALREADY_MAPPED = 208 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3298) [C++] Move murmur3 hash implementation to arrow/util
[ https://issues.apache.org/jira/browse/ARROW-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855725#comment-16855725 ] Antoine Pitrou commented on ARROW-3298: --- What complicates things a bit is that there are several different versions of Murmur (murmur2, murmur3, 32-bit-hash-producing, 64-bit-hash-producing) and also potentially several different implementations of each (with different performance characteristics). So some review of current usage across the codebase (Arrow, Plasma, Parquet, Gandiva) is needed. [~fsaintjacques] > [C++] Move murmur3 hash implementation to arrow/util > > > Key: ARROW-3298 > URL: https://issues.apache.org/jira/browse/ARROW-3298 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.14.0 > > > It would be good to consolidate hashing utility code in a central place (this > is currently in src/parquet) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3877) [C++] Provide access to "maximum decompressed size" functions in compression libraries (if they exist)
[ https://issues.apache.org/jira/browse/ARROW-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855782#comment-16855782 ] Wes McKinney commented on ARROW-3877: - If we wanted our {{pyarrow.compress}} and {{pyarrow.decompress}} functions to be interchangeable with their counterparts in libraries such as python-snappy, it would be helpful to be able to invoke decompress without knowing the exact uncompressed length. Some compressors require the uncompressed length, so in that case NotImplemented would be returned > [C++] Provide access to "maximum decompressed size" functions in compression > libraries (if they exist) > -- > > Key: ARROW-3877 > URL: https://issues.apache.org/jira/browse/ARROW-3877 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.14.0 > > > As follow up to ARROW-3831, some compression libraries have a function to > provide a hint for sizing the output buffer (if it is not known already) for > one-shot decompression. This would be helpful for sizing allocations in such > cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-5488) [R] Workaround when C++ lib not available
[ https://issues.apache.org/jira/browse/ARROW-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5488: -- Labels: pull-request-available (was: ) > [R] Workaround when C++ lib not available > - > > Key: ARROW-5488 > URL: https://issues.apache.org/jira/browse/ARROW-5488 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Romain François >Priority: Major > Labels: pull-request-available > > As a way to get to CRAN, we need some way for the package to still compile, > install, and test (although doing nothing useful) even when the C++ lib is not > available. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5485) [Gandiva][Crossbow] OSx builds failing
[ https://issues.apache.org/jira/browse/ARROW-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855982#comment-16855982 ] Praveen Kumar Desabandu commented on ARROW-5485: [~wesmckinn] - any pointers will be highly appreciated and useful :) > [Gandiva][Crossbow] OSx builds failing > -- > > Key: ARROW-5485 > URL: https://issues.apache.org/jira/browse/ARROW-5485 > Project: Apache Arrow > Issue Type: Task > Components: Packaging >Affects Versions: 0.14.0 >Reporter: Praveen Kumar Desabandu >Assignee: Praveen Kumar Desabandu >Priority: Major > Fix For: 0.14.0 > > > OSX builds are failing for the last 3 days. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-5077) [Rust] Release process should change Cargo.toml to use release versions
[ https://issues.apache.org/jira/browse/ARROW-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sutou Kouhei resolved ARROW-5077. - Resolution: Fixed Issue resolved by pull request 4460 [https://github.com/apache/arrow/pull/4460] > [Rust] Release process should change Cargo.toml to use release versions > --- > > Key: ARROW-5077 > URL: https://issues.apache.org/jira/browse/ARROW-5077 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust >Affects Versions: 0.13.0 >Reporter: Andy Grove >Assignee: Yosuke Shiro >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > In the dev tree we use relative path dependencies between arrow, parquet, and > datafusion, which means we can't just run cargo publish for each crate from > the release source tarball. > It would be good to have the release packaging change the Cargo.toml for > parquet and datafusion to have dependencies on a versioned release instead of > a relative path to remove this manual step when publishing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5508) [C++] Create reusable Iterator interface
Wes McKinney created ARROW-5508: --- Summary: [C++] Create reusable Iterator interface Key: ARROW-5508 URL: https://issues.apache.org/jira/browse/ARROW-5508 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Wes McKinney Assignee: Wes McKinney Fix For: 0.14.0 We have various iterator-like classes. I envision a reusable interface like
{code}
template <typename T>
class Iterator {
 public:
  virtual ~Iterator() = default;
  virtual Status Next(T* out) = 0;
};
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5510) [Format] Feather V2
Neal Richardson created ARROW-5510: -- Summary: [Format] Feather V2 Key: ARROW-5510 URL: https://issues.apache.org/jira/browse/ARROW-5510 Project: Apache Arrow Issue Type: Improvement Components: Format Reporter: Neal Richardson Assignee: Wes McKinney Fix For: 0.14.0 The initial Feather file format is a minimal subset of the Arrow IPC format. It has a number of limitations (see [https://wesmckinney.com/blog/feather-arrow-future/]). We want to retain "feather" as the name of the on-disk representation of Arrow memory, so in order to support everything that Arrow supports, we need a "feather 2.0" format. IIUC, defining the file format is "done" (dump the memory to disk). Remaining issues include upgrading "feather" readers and writers in all languages to support both feather 1.0 and feather 2.0. (e.g. https://issues.apache.org/jira/browse/ARROW-5501) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5501) [R] read/write_feather/arrow?
[ https://issues.apache.org/jira/browse/ARROW-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856132#comment-16856132 ] Neal Richardson commented on ARROW-5501: Created here: https://issues.apache.org/jira/browse/ARROW-5510 > [R] read/write_feather/arrow? > - > > Key: ARROW-5501 > URL: https://issues.apache.org/jira/browse/ARROW-5501 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Neal Richardson >Priority: Major > Fix For: 0.14.0 > > > read_feather and write_feather exist, and there is also write_arrow. But no > read_arrow. > Some questions (which go beyond just R): There's talk of a "feather 2.0", > i.e. "just" serializing the IPC format (which IIUC is what write_arrow does). > Are we going to continue to call the file format "Feather", and possibly > continue supporting the "feather 1.0" format as a subset/special case? Or > will "feather" mean this limited format and "arrow" be the name of the > full-featured file? > In terms of this issue, should write_arrow be folded into write_feather and > there be an argument for indicating which version to write? Or should the > distinction be maintained, and we need to add a read_arrow() function? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2447) [C++] Create a device abstraction
[ https://issues.apache.org/jira/browse/ARROW-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-2447: Fix Version/s: (was: 0.14.0) > [C++] Create a device abstraction > - > > Key: ARROW-2447 > URL: https://issues.apache.org/jira/browse/ARROW-2447 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, GPU >Affects Versions: 0.9.0 >Reporter: Antoine Pitrou >Assignee: Pearu Peterson >Priority: Major > > Right now, a plain Buffer doesn't carry information about where it actually > lies. That information also cannot be passed around, so you get APIs like > {{PlasmaClient}} which take or return device number integers, and have > implementations which hardcode operations on CUDA buffers. Also, unsuspecting > receivers of a {{Buffer}} pointer may try to act on the underlying memory > without knowing whether it's CPU-reachable or not. > Here is a sketch for a proposed Device abstraction: > {code}
> class Device {
>   enum DeviceKind { KIND_CPU, KIND_CUDA };
>   virtual DeviceKind kind() const;
>   // MemoryPool* default_memory_pool() const;
>   // std::shared_ptr<Buffer> Allocate(...);
> };
> class CpuDevice : public Device {};
> class CudaDevice : public Device {
>   int device_num() const;
> };
> class Buffer {
>   virtual DeviceKind device_kind() const;
>   virtual std::shared_ptr<Device> device() const;
>   virtual bool on_cpu() const { return true; }
>   const uint8_t* cpu_data() const { return on_cpu() ? data() : nullptr; }
>   uint8_t* cpu_mutable_data() { return on_cpu() ? mutable_data() : nullptr; }
>   virtual CopyToCpu(std::shared_ptr<Buffer> dest) const;
>   virtual CopyFromCpu(std::shared_ptr<Buffer> src);
> };
> class CudaBuffer : public Buffer {
>   virtual bool on_cpu() const { return false; }
> };
> CopyBuffer(std::shared_ptr<Buffer> dest, const std::shared_ptr<Buffer> src);
> {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-2801) [Python] Implement split_row_groups for ParquetDataset
[ https://issues.apache.org/jira/browse/ARROW-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-2801: Fix Version/s: (was: 0.14.0) 0.15.0 > [Python] Implement split_row_groups for ParquetDataset > - > > Key: ARROW-2801 > URL: https://issues.apache.org/jira/browse/ARROW-2801 > Project: Apache Arrow > Issue Type: New Feature > Components: Python >Reporter: Robbie Gruener >Assignee: Robbie Gruener >Priority: Minor > Labels: datasets, parquet, pull-request-available > Fix For: 0.15.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Currently the split_row_groups argument in ParquetDataset yields a not > implemented error. An easy and efficient way to implement this is by using > the summary metadata file instead of opening every footer file -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856229#comment-16856229 ] Liya Fan commented on ARROW-5508: - [~wesmckinn], thanks for the good point. What is the standard way to know if there is a next element in the iterator? > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Fix For: 0.14.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code}
> template <typename T>
> class Iterator {
>  public:
>   virtual ~Iterator() = default;
>   virtual Status Next(T* out) = 0;
> };
> {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5511) [Packaging] Enable Flight in Conda packages
David Li created ARROW-5511: --- Summary: [Packaging] Enable Flight in Conda packages Key: ARROW-5511 URL: https://issues.apache.org/jira/browse/ARROW-5511 Project: Apache Arrow Issue Type: Improvement Components: C++, Packaging, Python Reporter: David Li Assignee: David Li Fix For: 0.14.0 We should build Conda packages with Flight enabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-5511) [Packaging] Enable Flight in Conda packages
[ https://issues.apache.org/jira/browse/ARROW-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5511: -- Labels: pull-request-available (was: ) > [Packaging] Enable Flight in Conda packages > --- > > Key: ARROW-5511 > URL: https://issues.apache.org/jira/browse/ARROW-5511 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Packaging, Python >Reporter: David Li >Assignee: David Li >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > > We should build Conda packages with Flight enabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (ARROW-5285) [C++][Plasma] GpuProcessHandle is not released when GPU object deleted
[ https://issues.apache.org/jira/browse/ARROW-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shengjun.li closed ARROW-5285. -- It is fixed. > [C++][Plasma] GpuProcessHandle is not released when GPU object deleted > -- > > Key: ARROW-5285 > URL: https://issues.apache.org/jira/browse/ARROW-5285 > Project: Apache Arrow > Issue Type: Bug > Components: C++, C++ - Plasma, GPU >Affects Versions: 0.13.0 >Reporter: shengjun.li >Assignee: shengjun.li >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > cpp/CMakeLists.txt > option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" > ON) > option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON) > In the plasma client, GpuProcessHandle is never released although the GPU object > is deleted. > Thus, cuIpcCloseMemHandle is never called. > When I repeatedly create and delete GPU memory, the following error may occur. > IOError: Cuda Driver API call in > /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 155 failed with > code 208: cuIpcOpenMemHandle(, *handle, > CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) > Note: CUDA_ERROR_ALREADY_MAPPED = 208 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5509) [R] write_parquet()
Neal Richardson created ARROW-5509: -- Summary: [R] write_parquet() Key: ARROW-5509 URL: https://issues.apache.org/jira/browse/ARROW-5509 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Fix For: 0.14.0 We can read but not yet write. The C++ library supports this and pyarrow does it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-1837) [Java] Unable to read unsigned integers outside signed range for bit width in integration tests
[ https://issues.apache.org/jira/browse/ARROW-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-1837. - Resolution: Fixed Issue resolved by pull request 4432 [https://github.com/apache/arrow/pull/4432] > [Java] Unable to read unsigned integers outside signed range for bit width in > integration tests > --- > > Key: ARROW-1837 > URL: https://issues.apache.org/jira/browse/ARROW-1837 > Project: Apache Arrow > Issue Type: Bug > Components: Java >Reporter: Wes McKinney >Assignee: Micah Kornfield >Priority: Major > Labels: columnar-format-1.0, pull-request-available > Fix For: 0.14.0 > > Attachments: generated_primitive.json > > Time Spent: 4h 20m > Remaining Estimate: 0h > > I believe this was introduced recently (perhaps in the refactors), but there > was a problem where the integration tests weren't being properly run that hid > the error from us > see https://github.com/apache/arrow/pull/1294#issuecomment-345553066 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-5485) [Gandiva][Crossbow] OSx builds failing
[ https://issues.apache.org/jira/browse/ARROW-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5485: -- Labels: pull-request-available (was: ) > [Gandiva][Crossbow] OSx builds failing > -- > > Key: ARROW-5485 > URL: https://issues.apache.org/jira/browse/ARROW-5485 > Project: Apache Arrow > Issue Type: Task > Components: Packaging >Affects Versions: 0.14.0 >Reporter: Praveen Kumar Desabandu >Assignee: Praveen Kumar Desabandu >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > > OSX builds are failing for the last 3 days. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-5242) [C++] Arrow doesn't compile cleanly with Visual Studio 2017 Update 9 or later due to narrowing
[ https://issues.apache.org/jira/browse/ARROW-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5242: -- Labels: pull-request-available (was: ) > [C++] Arrow doesn't compile cleanly with Visual Studio 2017 Update 9 or later > due to narrowing > -- > > Key: ARROW-5242 > URL: https://issues.apache.org/jira/browse/ARROW-5242 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Billy Robert O'Neal III >Assignee: Billy Robert O'Neal III >Priority: Major > Labels: pull-request-available > > The std::string constructor call here is narrowing wchar_t to char, which > emits warning C4244 on current releases of Visual Studio: > [https://github.com/apache/arrow/blob/master/cpp/src/arrow/vendored/datetime/tz.cpp#L205] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
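The usual remedy for warnings like C4244 is to make each per-character narrowing explicit rather than letting the std::string constructor do it implicitly. A minimal sketch, assuming ASCII-only input (which holds for the tz database names in the file cited above); NarrowAscii is an illustrative helper, not the actual patch:

```cpp
#include <cassert>
#include <string>

// Implicitly constructing std::string from a wchar_t range narrows each
// wchar_t to char, which MSVC flags as C4244. Casting per character makes
// the narrowing intentional and silences the warning.
// Assumption: input is ASCII, so the cast is lossless.
std::string NarrowAscii(const std::wstring& ws) {
  std::string out;
  out.reserve(ws.size());
  for (wchar_t wc : ws) {
    out.push_back(static_cast<char>(wc));  // explicit, warning-free narrowing
  }
  return out;
}
```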
[jira] [Updated] (ARROW-5513) [Java] Refactor method name for getstartOffset to use camel case
[ https://issues.apache.org/jira/browse/ARROW-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liya Fan updated ARROW-5513: Description: The method getstartOffset in class org.apache.arrow.vector.BaseVariableWidthVector should be refactored to getStartOffset, to comply with camel case naming. Fortunately, this method is not public, so the changes are internal to Arrow. > [Java] Refactor method name for getstartOffset to use camel case > > > Key: ARROW-5513 > URL: https://issues.apache.org/jira/browse/ARROW-5513 > Project: Apache Arrow > Issue Type: Improvement > Components: Java >Reporter: Liya Fan >Assignee: Liya Fan >Priority: Trivial > > The method getstartOffset in class > org.apache.arrow.vector.BaseVariableWidthVector should be refactored to > getStartOffset, to comply with camel case naming. > Fortunately, this method is not public, so the changes are internal to Arrow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-5335) [Python] Support for converting variable dictionaries to pandas
[ https://issues.apache.org/jira/browse/ARROW-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-5335: Priority: Blocker (was: Major) > [Python] Support for converting variable dictionaries to pandas > --- > > Key: ARROW-5335 > URL: https://issues.apache.org/jira/browse/ARROW-5335 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Wes McKinney >Priority: Blocker > Fix For: 0.14.0 > > > Address after ARROW-3144. The current code presumes the dictionary is the > same for all chunks. We should check if the dictionary is the same for all > chunks, and if not, perform a {{DictionaryType::Unify}} operation and then > write out into the resulting {{CategoricalBlock}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5335) [Python] Support for converting variable dictionaries to pandas
[ https://issues.apache.org/jira/browse/ARROW-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856293#comment-16856293 ] Wes McKinney commented on ARROW-5335: - I think it would be irresponsible to release without at least adding a check for all dictionaries being the same > [Python] Support for converting variable dictionaries to pandas > --- > > Key: ARROW-5335 > URL: https://issues.apache.org/jira/browse/ARROW-5335 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Wes McKinney >Priority: Blocker > Fix For: 0.14.0 > > > Address after ARROW-3144. The current code presumes the dictionary is the > same for all chunks. We should check if the dictionary is the same for all > chunks, and if not, perform a {{DictionaryType::Unify}} operation and then > write out into the resulting {{CategoricalBlock}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
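What a unification pass over chunked dictionaries has to do can be sketched without Arrow itself: build one combined dictionary across chunks and remap each chunk's indices into it, so all chunks share the categories of the resulting pandas Categorical. UnifiedDict and AddChunk below are illustrative names, not Arrow's DictionaryType::Unify API:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// A minimal model of dictionary unification: each chunk arrives with its own
// (dictionary, indices) pair; we merge its dictionary into a combined one and
// translate the chunk's indices through the resulting transpose map.
struct UnifiedDict {
  std::vector<std::string> values;        // combined dictionary, in first-seen order
  std::map<std::string, int32_t> lookup;  // value -> unified index

  std::vector<int32_t> AddChunk(const std::vector<std::string>& dict,
                                const std::vector<int32_t>& indices) {
    // transpose[i] = unified index of the chunk's i-th dictionary value
    std::vector<int32_t> transpose(dict.size());
    for (size_t i = 0; i < dict.size(); ++i) {
      auto it = lookup.find(dict[i]);
      if (it == lookup.end()) {
        transpose[i] = static_cast<int32_t>(values.size());
        lookup[dict[i]] = transpose[i];
        values.push_back(dict[i]);
      } else {
        transpose[i] = it->second;
      }
    }
    // Remap this chunk's indices onto the unified dictionary.
    std::vector<int32_t> out;
    out.reserve(indices.size());
    for (int32_t idx : indices) out.push_back(transpose[idx]);
    return out;
  }
};
```

The cheap check Wes mentions falls out of the same structure: if every chunk's transpose map is the identity, the dictionaries were already equal and no remapping is needed.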
[jira] [Updated] (ARROW-5513) [Java] Refactor method name for getstartOffset to use camel case
[ https://issues.apache.org/jira/browse/ARROW-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5513: -- Labels: pull-request-available (was: ) > [Java] Refactor method name for getstartOffset to use camel case > > > Key: ARROW-5513 > URL: https://issues.apache.org/jira/browse/ARROW-5513 > Project: Apache Arrow > Issue Type: Improvement > Components: Java >Reporter: Liya Fan >Assignee: Liya Fan >Priority: Trivial > Labels: pull-request-available > > The method getstartOffset in class > org.apache.arrow.vector.BaseVariableWidthVector should be refactored to > getStartOffset, to comply with the camel case. > Fortunately, this method is not public, so the changes are internal to Arrow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-5115) [JS] Implement the Vector Builders
[ https://issues.apache.org/jira/browse/ARROW-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5115: -- Labels: pull-request-available (was: ) > [JS] Implement the Vector Builders > -- > > Key: ARROW-5115 > URL: https://issues.apache.org/jira/browse/ARROW-5115 > Project: Apache Arrow > Issue Type: New Feature > Components: JavaScript >Affects Versions: 0.13.0 >Reporter: Paul Taylor >Assignee: Paul Taylor >Priority: Major > Labels: pull-request-available > > We should implement the streaming Vector Builders in JS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-5020) [C++][Gandiva] Split Gandiva-related conda packages for builds into separate .yml conda env file
[ https://issues.apache.org/jira/browse/ARROW-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-5020. - Resolution: Fixed Issue resolved by pull request 4459 [https://github.com/apache/arrow/pull/4459] > [C++][Gandiva] Split Gandiva-related conda packages for builds into separate > .yml conda env file > > > Key: ARROW-5020 > URL: https://issues.apache.org/jira/browse/ARROW-5020 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Wes McKinney >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > These installs are large and should not be required unconditionally in CI and > elsewhere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5512) [C++] Draft initial public APIs for Datasets project
Wes McKinney created ARROW-5512: --- Summary: [C++] Draft initial public APIs for Datasets project Key: ARROW-5512 URL: https://issues.apache.org/jira/browse/ARROW-5512 Project: Apache Arrow Issue Type: New Feature Components: C++ Reporter: Wes McKinney Assignee: Wes McKinney Fix For: 0.14.0 The objective is to ensure general alignment with the discussion document https://docs.google.com/document/d/1bVhzifD38qDypnSjtf8exvpP3sSB5x_Kw9m-n66FB2c/edit?usp=sharing so that an initial working implementation can begin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5513) [Java] Refactor method name for getstartOffset to use camel case
Liya Fan created ARROW-5513: --- Summary: [Java] Refactor method name for getstartOffset to use camel case Key: ARROW-5513 URL: https://issues.apache.org/jira/browse/ARROW-5513 Project: Apache Arrow Issue Type: Improvement Components: Java Reporter: Liya Fan Assignee: Liya Fan -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5506) [C++] Generic columnar format functionality
Andrei Gudkov created ARROW-5506: Summary: [C++] Generic columnar format functionality Key: ARROW-5506 URL: https://issues.apache.org/jira/browse/ARROW-5506 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Andrei Gudkov Discussion is here: [https://github.com/apache/arrow/pull/4066] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-5463) [Rust] Implement AsRef for Buffer
[ https://issues.apache.org/jira/browse/ARROW-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved ARROW-5463. - Resolution: Fixed Fix Version/s: 0.14.0 Issue resolved by pull request 4450 [https://github.com/apache/arrow/pull/4450] > [Rust] Implement AsRef for Buffer > - > > Key: ARROW-5463 > URL: https://issues.apache.org/jira/browse/ARROW-5463 > Project: Apache Arrow > Issue Type: New Feature > Components: Rust >Reporter: Renjie Liu >Assignee: Renjie Liu >Priority: Major > Labels: pull-request-available > Fix For: 0.14.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Implement AsRef ArrowNativeType for Buffer -- This message was sent by Atlassian JIRA (v7.6.3#76005)