[jira] [Commented] (ARROW-5802) [CI] Dockerize "lint" Travis CI job

2019-07-15 Thread Wes McKinney (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885819#comment-16885819
 ] 

Wes McKinney commented on ARROW-5802:
-

cc [~kszucs] [~fsaintjacques] since you may get further than me on this

> [CI] Dockerize "lint" Travis CI job
> ---
>
> Key: ARROW-5802
> URL: https://issues.apache.org/jira/browse/ARROW-5802
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> Run via docker-compose; also enables contributors to lint locally



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5802) [CI] Dockerize "lint" Travis CI job

2019-07-15 Thread Wes McKinney (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885818#comment-16885818
 ] 

Wes McKinney commented on ARROW-5802:
-

I started working on this but I had to give up

https://github.com/wesm/arrow/tree/ARROW-5802

I started with Debian Stretch since there are "slim" images with the JRE from 
the openjdk Docker Hub account, but then I got stuck because we have Python 3.6 
code that needs to be linted, so we'll probably have to put conda in the image.

I'm also not sure how to handle the ARROW_CI_*_AFFECTED environment variables 
-- should we just always run the checks?

As a last matter, running hadolint without Docker seems complicated, since you 
can't do docker-in-docker.
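The docker-compose approach described above could be sketched roughly as follows. This is purely illustrative: the service name, Dockerfile path, and entry script are hypothetical, and the conda-provided Python 3.6 just follows the comment above, not the actual Arrow setup.

```yaml
# Hypothetical docker-compose service for running all lint checks locally.
version: '3'
services:
  lint:
    build:
      context: .
      # Debian Stretch base with conda for Python 3.6, per the comment above
      dockerfile: ci/lint.Dockerfile
    volumes:
      - .:/arrow          # mount the checkout so contributors can lint locally
    command: /arrow/ci/run_lint.sh
```

With something like this in place, `docker-compose run lint` would run the same checks locally that Travis runs in CI.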

> [CI] Dockerize "lint" Travis CI job
> ---
>
> Key: ARROW-5802
> URL: https://issues.apache.org/jira/browse/ARROW-5802
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> Run via docker-compose; also enables contributors to lint locally





[jira] [Resolved] (ARROW-5884) [Java] Fix the get method of StructVector

2019-07-15 Thread Micah Kornfield (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Kornfield resolved ARROW-5884.

   Resolution: Fixed
Fix Version/s: 1.0.0

Issue resolved by pull request 4831
[https://github.com/apache/arrow/pull/4831]

> [Java] Fix the get method of StructVector
> -
>
> Key: ARROW-5884
> URL: https://issues.apache.org/jira/browse/ARROW-5884
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When the data at the specified location is null, there is no need to call the 
> method from super to set the reader:
> {code:java}
> holder.isSet = isSet(index);
> super.get(index, holder);
> {code}





[jira] [Resolved] (ARROW-5835) [Java] Support Dictionary Encoding for binary type

2019-07-15 Thread Micah Kornfield (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Kornfield resolved ARROW-5835.

   Resolution: Fixed
Fix Version/s: 1.0.0

Issue resolved by pull request 4792
[https://github.com/apache/arrow/pull/4792]

> [Java] Support Dictionary Encoding for binary type
> --
>
> Key: ARROW-5835
> URL: https://issues.apache.org/jira/browse/ARROW-5835
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Java
>Reporter: Ji Liu
>Assignee: Ji Liu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> This is not yet implemented because a byte array cannot be used as a HashMap 
> key. One possible approach is to wrap byte arrays in a class that implements 
> equals and hashCode.
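The wrapper idea above can be sketched in Python, where a mutable bytearray is likewise unhashable and needs a wrapper defining equality and hashing. The class below is purely illustrative, not the Java implementation:

```python
class BytesKey:
    """Hashable wrapper around a mutable byte buffer -- an illustration of
    wrapping a byte array so it can serve as a hash-map key."""

    def __init__(self, buf):
        self._key = bytes(buf)  # take an immutable snapshot of the bytes

    def __eq__(self, other):
        return isinstance(other, BytesKey) and self._key == other._key

    def __hash__(self):
        return hash(self._key)

# A bytearray itself cannot be a dict key, but the wrapper can:
interned = {BytesKey(bytearray(b"hello")): 0}
assert BytesKey(bytearray(b"hello")) in interned
```

The same shape in Java would be a small class overriding `equals` and `hashCode` over a defensive copy of the `byte[]`.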





[jira] [Updated] (ARROW-5956) [R] Undefined symbol GetFieldByName

2019-07-15 Thread Jeffrey Wong (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Wong updated ARROW-5956:

Summary: [R] Undefined symbol GetFieldByName  (was: Undefined symbol 
GetFieldByName)

> [R] Undefined symbol GetFieldByName
> ---
>
> Key: ARROW-5956
> URL: https://issues.apache.org/jira/browse/ARROW-5956
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
> Environment: Ubuntu 16.04, R 3.4.4, python 3.6.5
>Reporter: Jeffrey Wong
>Priority: Major
>
> I have installed pyarrow 0.14.0 and want to be able to also use R arrow. In 
> my work I use rpy2 a lot to exchange Python data structures with R data 
> structures, so I would like R arrow to link against the exact same .so files 
> found in pyarrow.
>  
>  
> When I pass in include_dir and lib_dir to R's configure, pointing to 
> pyarrow's include and pyarrow's root directories, I am able to compile R's 
> arrow.so file. However, I am unable to load it in an R session, getting the 
> error:
>  
> {code:java}
> > dyn.load('arrow.so')
> Error in dyn.load("arrow.so") :
>  unable to load shared object '/tmp/arrow2/r/src/arrow.so':
>  /tmp/arrow2/r/src/arrow.so: undefined symbol: 
> _ZNK5arrow11StructArray14GetFieldByNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE{code}
>  
>  
> Steps to reproduce:
>  
> Install pyarrow, which also ships libarrow.so and libparquet.so
>  
> {code:java}
> pip3 install pyarrow --upgrade --user
> PY_ARROW_PATH=$(python3 -c "import pyarrow, os; 
> print(os.path.dirname(pyarrow.__file__))")
> PY_ARROW_VERSION=$(python3 -c "import pyarrow; print(pyarrow.__version__)")
> ln -s $PY_ARROW_PATH/libarrow.so.14 $PY_ARROW_PATH/libarrow.so
> ln -s $PY_ARROW_PATH/libparquet.so.14 $PY_ARROW_PATH/libparquet.so
> {code}
>  
>  
> Add to LD_LIBRARY_PATH
>  
> {code:java}
> sudo tee -a /usr/lib/R/etc/ldpaths <<LINES
> LD_LIBRARY_PATH="\${LD_LIBRARY_PATH}:$PY_ARROW_PATH"
> export LD_LIBRARY_PATH
> LINES
> sudo tee -a /usr/lib/rstudio-server/bin/r-ldpath <<LINES
> LD_LIBRARY_PATH="\${LD_LIBRARY_PATH}:$PY_ARROW_PATH"
> export LD_LIBRARY_PATH
> LINES
> export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:$PY_ARROW_PATH"
> {code}
>  
>  
> Install r arrow from source
> {code:java}
> git clone https://github.com/apache/arrow.git /tmp/arrow2
> cd /tmp/arrow2/r
> git checkout tags/apache-arrow-0.14.0
> R CMD INSTALL ./ --configure-vars="INCLUDE_DIR=$PY_ARROW_PATH/include 
> LIB_DIR=$PY_ARROW_PATH"{code}
>  
> I have noticed that the R package for arrow no longer has an RcppExports, but 
> instead an arrowExports. Could it be that the lack of RcppExports has made it 
> difficult to find GetFieldByName?





[jira] [Updated] (ARROW-5955) [Plasma] Support setting memory quotas per plasma client for better isolation

2019-07-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5955:
--
Labels: pull-request-available  (was: )

> [Plasma] Support setting memory quotas per plasma client for better isolation
> -
>
> Key: ARROW-5955
> URL: https://issues.apache.org/jira/browse/ARROW-5955
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Eric Liang
>Priority: Major
>  Labels: pull-request-available
>
> Currently, plasma evicts objects according to a global LRU queue. In Ray, this 
> often causes memory-intensive workloads to fail unpredictably, since a client 
> that creates objects at a high rate can evict objects created by clients at 
> lower rates. This is despite the fact that the true working set of both 
> clients may be quite small.
> cc [~pcmoritz]
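The failure mode described above can be sketched with a toy global-LRU store (a simulation only, not the Plasma implementation): a client creating objects at a high rate evicts another client's small working set.

```python
from collections import OrderedDict

class LruStore:
    """Toy object store with one global LRU queue and no per-client quotas."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.objects = OrderedDict()  # object id -> owning client

    def put(self, obj_id, client):
        self.objects[obj_id] = client
        self.objects.move_to_end(obj_id)          # mark most recently used
        while len(self.objects) > self.capacity:
            self.objects.popitem(last=False)      # evict least recently used

store = LruStore(capacity=4)
store.put("slow-0", "slow")       # slow client's tiny working set
for i in range(10):               # fast client floods the store...
    store.put(f"fast-{i}", "fast")
# ...and evicts the slow client's object despite its small footprint
assert "slow-0" not in store.objects
```

A per-client quota, as proposed, would cap how much of the capacity the fast client may occupy, leaving the slow client's working set intact.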





[jira] [Created] (ARROW-5956) Undefined symbol GetFieldByName

2019-07-15 Thread Jeffrey Wong (JIRA)
Jeffrey Wong created ARROW-5956:
---

 Summary: Undefined symbol GetFieldByName
 Key: ARROW-5956
 URL: https://issues.apache.org/jira/browse/ARROW-5956
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
 Environment: Ubuntu 16.04, R 3.4.4, python 3.6.5
Reporter: Jeffrey Wong


I have installed pyarrow 0.14.0 and want to be able to also use R arrow. In my 
work I use rpy2 a lot to exchange Python data structures with R data 
structures, so I would like R arrow to link against the exact same .so files 
found in pyarrow.

 

 

When I pass in include_dir and lib_dir to R's configure, pointing to pyarrow's 
include and pyarrow's root directories, I am able to compile R's arrow.so file. 
However, I am unable to load it in an R session, getting the error:




 
{code:java}
> dyn.load('arrow.so')
Error in dyn.load("arrow.so") :
 unable to load shared object '/tmp/arrow2/r/src/arrow.so':
 /tmp/arrow2/r/src/arrow.so: undefined symbol: 
_ZNK5arrow11StructArray14GetFieldByNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE{code}
 

 

Steps to reproduce:

 

Install pyarrow, which also ships libarrow.so and libparquet.so

 
{code:java}
pip3 install pyarrow --upgrade --user
PY_ARROW_PATH=$(python3 -c "import pyarrow, os; 
print(os.path.dirname(pyarrow.__file__))")
PY_ARROW_VERSION=$(python3 -c "import pyarrow; print(pyarrow.__version__)")
ln -s $PY_ARROW_PATH/libarrow.so.14 $PY_ARROW_PATH/libarrow.so
ln -s $PY_ARROW_PATH/libparquet.so.14 $PY_ARROW_PATH/libparquet.so
{code}
 

 

Add to LD_LIBRARY_PATH

 
{code:java}
sudo tee -a /usr/lib/R/etc/ldpaths 

[jira] [Created] (ARROW-5955) [Plasma] Support setting memory quotas per plasma client for better isolation

2019-07-15 Thread Eric Liang (JIRA)
Eric Liang created ARROW-5955:
-

 Summary: [Plasma] Support setting memory quotas per plasma client 
for better isolation
 Key: ARROW-5955
 URL: https://issues.apache.org/jira/browse/ARROW-5955
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Eric Liang


Currently, plasma evicts objects according to a global LRU queue. In Ray, this 
often causes memory-intensive workloads to fail unpredictably, since a client 
that creates objects at a high rate can evict objects created by clients at 
lower rates. This is despite the fact that the true working set of both clients 
may be quite small.

cc [~pcmoritz]





[jira] [Commented] (ARROW-5953) Thrift download ERRORS with apache-arrow-0.14.0

2019-07-15 Thread Brian (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1688#comment-1688
 ] 

Brian commented on ARROW-5953:
--

This appears to be a problem local to two different Linux systems I'm using here 
at SAS to build the arrow source tree.  I'll do some more sleuthing and report 
back in case what I find can help others.

> Thrift download ERRORS with apache-arrow-0.14.0
> ---
>
> Key: ARROW-5953
> URL: https://issues.apache.org/jira/browse/ARROW-5953
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.13.0, 0.14.0
> Environment: RHEL 6.7
>Reporter: Brian
>Priority: Major
>
> cmake returns:
> requests.exceptions.SSLError: hostname 'www.apache.org' doesn't match either 
> of '*.openoffice.org', 'openoffice.org'/thrift/0.12.0/thrift-0.12.0.tar.gz
> during the check for the thrift download location.
> This occurs with a freshly inflated arrow source release tree 
> where cmake is running for the first time.
> Reproducible with the release levels of apache-arrow-0.14.0 
> and 0.13.0. I tried this 3-5x on 15Jul2019 and see it consistently each 
> time.
> Here's the full context from cmake output:
> {quote}-- Checking for module 'thrift'
> --   No package 'thrift' found
> -- Could NOT find Thrift (missing: THRIFT_STATIC_LIB THRIFT_INCLUDE_DIR 
> THRIFT_COMPILER)
> Building Apache Thrift from source
> Downloading Apache Thrift from Traceback (most recent call last):
>   File "…/apache-arrow-0.14.0/cpp/build-support/get_apache_mirror.py", line 
> 38, in <module>
>     suggested_mirror = get_url('[https://www.apache.org/dyn/]'
>   File "…/apache-arrow-0.14.0/cpp/build-support/get_apache_mirror.py", line 
> 27, in get_url
>     return requests.get(url).content
>   File "/usr/lib/python2.6/site-packages/requests/api.py", line 68, in get
>     return request('get', url, **kwargs)
>   File "/usr/lib/python2.6/site-packages/requests/api.py", line 50, in request
>     response = session.request(method=method, url=url, **kwargs)
>   File "/usr/lib/python2.6/site-packages/requests/sessions.py", line 464, in 
> request
>     resp = self.send(prep, **send_kwargs)
>   File "/usr/lib/python2.6/site-packages/requests/sessions.py", line 576, in 
> send
>     r = adapter.send(request, **kwargs)
>   File "/usr/lib/python2.6/site-packages/requests/adapters.py", line 431, in 
> send
>     raise SSLError(e, request=request)
> requests.exceptions.SSLError: hostname 'www.apache.org' doesn't match either 
> of '*.openoffice.org', 'openoffice.org'/thrift/0.12.0/thrift-0.12.0.tar.gz
> {quote}
> Per Wes' suggestion I ran the following directly:
> python cpp/build-support/get_apache_mirror.py 
> [https://www-eu.apache.org/dist/] [http://us.mirrors.quenda.co/apache/]
> with this output:
> [https://www-eu.apache.org/dist/]  [http://us.mirrors.quenda.co/apache/]
>  
>  
> *NOTE:* here are the cmake thrift log lines from a build of an apache-arrow 
> git clone on 06Jul2019 where cmake/make ran fine.
>  
> {quote}-- Checking for module 'thrift'
> -- No package 'thrift' found
> -- Could NOT find Thrift (missing: THRIFT_STATIC_LIB) 
> Building Apache Thrift from source
> Downloading Apache Thrift from 
> http://mirror.metrocast.net/apache//thrift/0.12.0/thrift-0.12.0.tar.gz
> {quote}
> Currently, cmake runs successfully on this apache-arrow-0.14.0 directory.
>  
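The SSLError quoted above means the host that answered served a certificate naming only openoffice.org hosts, which cannot match `www.apache.org`. A much-simplified sketch of the hostname check that fails (fnmatch is looser than real RFC 6125 matching; this is for illustration only):

```python
import fnmatch

def hostname_matches(hostname, cert_patterns):
    """Simplified certificate hostname check: does the hostname match any
    of the names the certificate presents?"""
    return any(fnmatch.fnmatch(hostname, p) for p in cert_patterns)

# The certificate names from the error message in this report:
cert_names = ["*.openoffice.org", "openoffice.org"]
assert not hostname_matches("www.apache.org", cert_names)   # -> SSLError
assert hostname_matches("www.openoffice.org", cert_names)
```

This is consistent with a misbehaving mirror or an interfering proxy returning a certificate for the wrong site, rather than a problem in the Arrow build scripts themselves.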





[jira] [Created] (ARROW-5954) Organize source and binary dependency licenses into directories

2019-07-15 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-5954:
--

 Summary: Organize source and binary dependency licenses into 
directories
 Key: ARROW-5954
 URL: https://issues.apache.org/jira/browse/ARROW-5954
 Project: Apache Arrow
  Issue Type: Task
Reporter: Krisztian Szucs
 Fix For: 1.0.0


Similarly to what Spark does; see this comment: 
https://github.com/apache/arrow/pull/4880/files/b839964a2a43123991b5b291607ff1cb026fe8a4#diff-61e0bdf7e1b43c5c93d9488b22e04170





[jira] [Created] (ARROW-5953) Thrift download ERRORS with apache-arrow-0.14.0

2019-07-15 Thread Brian (JIRA)
Brian created ARROW-5953:


 Summary: Thrift download ERRORS with apache-arrow-0.14.0
 Key: ARROW-5953
 URL: https://issues.apache.org/jira/browse/ARROW-5953
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.14.0, 0.13.0
 Environment: RHEL 6.7
Reporter: Brian


cmake returns:

requests.exceptions.SSLError: hostname 'www.apache.org' doesn't match either of 
'*.openoffice.org', 'openoffice.org'/thrift/0.12.0/thrift-0.12.0.tar.gz

during the check for the thrift download location.

This occurs with a freshly inflated arrow source release tree 
where cmake is running for the first time.

Reproducible with the release levels of apache-arrow-0.14.0 and 
0.13.0. I tried this 3-5x on 15Jul2019 and see it consistently each time.

Here's the full context from cmake output:
{quote}-- Checking for module 'thrift'

--   No package 'thrift' found

-- Could NOT find Thrift (missing: THRIFT_STATIC_LIB THRIFT_INCLUDE_DIR 
THRIFT_COMPILER)

Building Apache Thrift from source

Downloading Apache Thrift from Traceback (most recent call last):

  File "…/apache-arrow-0.14.0/cpp/build-support/get_apache_mirror.py", line 38, 
in <module>

    suggested_mirror = get_url('[https://www.apache.org/dyn/]'

  File "…/apache-arrow-0.14.0/cpp/build-support/get_apache_mirror.py", line 27, 
in get_url

    return requests.get(url).content

  File "/usr/lib/python2.6/site-packages/requests/api.py", line 68, in get

    return request('get', url, **kwargs)

  File "/usr/lib/python2.6/site-packages/requests/api.py", line 50, in request

    response = session.request(method=method, url=url, **kwargs)

  File "/usr/lib/python2.6/site-packages/requests/sessions.py", line 464, in 
request

    resp = self.send(prep, **send_kwargs)

  File "/usr/lib/python2.6/site-packages/requests/sessions.py", line 576, in 
send

    r = adapter.send(request, **kwargs)

  File "/usr/lib/python2.6/site-packages/requests/adapters.py", line 431, in 
send

    raise SSLError(e, request=request)

requests.exceptions.SSLError: hostname 'www.apache.org' doesn't match either of 
'*.openoffice.org', 'openoffice.org'/thrift/0.12.0/thrift-0.12.0.tar.gz
{quote}
Per Wes' suggestion I ran the following directly:

python cpp/build-support/get_apache_mirror.py 
[https://www-eu.apache.org/dist/] [http://us.mirrors.quenda.co/apache/]

with this output:

[https://www-eu.apache.org/dist/]  [http://us.mirrors.quenda.co/apache/]

 

 

*NOTE:* here are the cmake thrift log lines from a build of an apache-arrow git 
clone on 06Jul2019 where cmake/make ran fine.

 
{quote}-- Checking for module 'thrift'
-- No package 'thrift' found
-- Could NOT find Thrift (missing: THRIFT_STATIC_LIB) 
Building Apache Thrift from source
Downloading Apache Thrift from 
http://mirror.metrocast.net/apache//thrift/0.12.0/thrift-0.12.0.tar.gz
{quote}
Currently, cmake runs successfully on this apache-arrow-0.14.0 directory.

 





[jira] [Resolved] (ARROW-5934) [Python] Bundle arrow's LICENSE with the wheels

2019-07-15 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-5934.
-
Resolution: Fixed

Issue resolved by pull request 4880
[https://github.com/apache/arrow/pull/4880]

> [Python] Bundle arrow's LICENSE with the wheels
> ---
>
> Key: ARROW-5934
> URL: https://issues.apache.org/jira/browse/ARROW-5934
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0, 0.14.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Guide to bundle LICENSE files with the wheels: 
> https://wheel.readthedocs.io/en/stable/user_guide.html#including-license-files-in-the-generated-wheel-file
> We also need to ensure that all third-party dependencies' licenses are 
> attached, especially because we're statically linking multiple third-party 
> dependencies; for example, uriparser is missing from the LICENSE file.
> cc [~wesmckinn]
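Following the wheel guide linked above, license files are typically pulled into the wheel's `.dist-info` directory via package metadata along these lines. The file names are illustrative and the actual pyarrow packaging may differ:

```ini
# setup.cfg -- illustrative fragment only
[metadata]
license_files =
    LICENSE.txt
    NOTICE.txt
    licenses/*.txt
```

The third-party notices (e.g. for statically linked dependencies such as uriparser) would then need to be present in those files before the wheel is built.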





[jira] [Assigned] (ARROW-5934) [Python] Bundle arrow's LICENSE with the wheels

2019-07-15 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-5934:
---

Assignee: Krisztian Szucs

> [Python] Bundle arrow's LICENSE with the wheels
> ---
>
> Key: ARROW-5934
> URL: https://issues.apache.org/jira/browse/ARROW-5934
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0, 0.14.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Guide to bundle LICENSE files with the wheels: 
> https://wheel.readthedocs.io/en/stable/user_guide.html#including-license-files-in-the-generated-wheel-file
> We also need to ensure that all third-party dependencies' licenses are 
> attached, especially because we're statically linking multiple third-party 
> dependencies; for example, uriparser is missing from the LICENSE file.
> cc [~wesmckinn]





[jira] [Created] (ARROW-5952) [Python] Segfault when reading empty table with category as pandas dataframe

2019-07-15 Thread Daniel Nugent (JIRA)
Daniel Nugent created ARROW-5952:


 Summary: [Python] Segfault when reading empty table with category 
as pandas dataframe
 Key: ARROW-5952
 URL: https://issues.apache.org/jira/browse/ARROW-5952
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.14.0
 Environment: Linux 3.10.0-327.36.3.el7.x86_64
Python 3.6.8
Pandas 0.24.2
Pyarrow 0.14.0

Reporter: Daniel Nugent


I have two short sample programs which demonstrate the issue:
{code:java}
import pyarrow as pa
import pandas as pd
empty = pd.DataFrame({'foo':[]},dtype='category')
table = pa.Table.from_pandas(empty)
outfile = pa.output_stream('bar')
writer = pa.RecordBatchFileWriter(outfile,table.schema)
writer.write(table)
writer.close()
{code}
{code:java}
import pyarrow as pa
pa.ipc.open_file('bar').read_pandas()
Segmentation fault
{code}

My apologies if this was already reported elsewhere, I searched but could not 
find an issue which seemed to refer to the same behavior.





[jira] [Commented] (ARROW-5894) [C++] libgandiva.so.14 is exporting libstdc++ symbols

2019-07-15 Thread Zhuo Peng (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885414#comment-16885414
 ] 

Zhuo Peng commented on ARROW-5894:
--

[https://github.com/apache/arrow/pull/4883]

> [C++] libgandiva.so.14 is exporting libstdc++ symbols
> -
>
> Key: ARROW-5894
> URL: https://issues.apache.org/jira/browse/ARROW-5894
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++ - Gandiva
>Affects Versions: 0.14.0
>Reporter: Zhuo Peng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For example:
> $ nm libgandiva.so.14 | grep "once_proxy"
> 018c0a10 T __once_proxy
>  
> many other symbols are also exported which I guess shouldn't be (e.g. LLVM 
> symbols)
>  
> There seems to be no linker script for libgandiva.so (there was, but was 
> never used and got deleted? 
> [https://github.com/apache/arrow/blob/9265fe35b67db93f5af0b47e92e039c637ad5b3e/cpp/src/gandiva/symbols-helpers.map]).
>  
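A linker version script of the kind mentioned would pin down the export list. As a sketch only: the `gdv_` prefix here is an assumption about gandiva's public C symbols, not a confirmed pattern:

```text
/* Illustrative GNU ld version script: export only the public C API and
   hide everything else (libstdc++, LLVM, internal symbols). */
{
  global:
    gdv_*;
  local:
    *;
};
```

Passed via `-Wl,--version-script=...`, a script like this would keep symbols such as `__once_proxy` out of the shared library's dynamic symbol table.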





[jira] [Updated] (ARROW-5894) [C++] libgandiva.so.14 is exporting libstdc++ symbols

2019-07-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5894:
--
Labels: pull-request-available  (was: )

> [C++] libgandiva.so.14 is exporting libstdc++ symbols
> -
>
> Key: ARROW-5894
> URL: https://issues.apache.org/jira/browse/ARROW-5894
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++ - Gandiva
>Affects Versions: 0.14.0
>Reporter: Zhuo Peng
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> For example:
> $ nm libgandiva.so.14 | grep "once_proxy"
> 018c0a10 T __once_proxy
>  
> many other symbols are also exported which I guess shouldn't be (e.g. LLVM 
> symbols)
>  
> There seems to be no linker script for libgandiva.so (there was, but was 
> never used and got deleted? 
> [https://github.com/apache/arrow/blob/9265fe35b67db93f5af0b47e92e039c637ad5b3e/cpp/src/gandiva/symbols-helpers.map]).
>  





[jira] [Created] (ARROW-5951) [Python][Wheel] Request UCS4 wheels in the packaging tasks

2019-07-15 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-5951:
--

 Summary: [Python][Wheel] Request UCS4 wheels in the packaging 
tasks 
 Key: ARROW-5951
 URL: https://issues.apache.org/jira/browse/ARROW-5951
 Project: Apache Arrow
  Issue Type: Task
  Components: Python
Reporter: Krisztian Szucs


{code}
[root@0b415e11a9ba multibuild]# cpython_path 2.7 16
/opt/python/cp27-cp27m
[root@0b415e11a9ba multibuild]# cpython_path 2.7 32
/opt/python/cp27-cp27mu
[root@0b415e11a9ba multibuild]# cpython_path 3.7 16
/opt/python/cp37-cp37m
[root@0b415e11a9ba multibuild]# cpython_path 3.7 32
/opt/python/cp37-cp37m
{code}
We should change the unicode_width properties in tasks.yml to 32 
for Python wheels > 2.7, because we'll always produce UCS-4 wheels, and it is 
quite misleading that we request 16-bit ones.

Multibuild does the conversion 
https://github.com/matthew-brett/multibuild/blob/devel/manylinux_utils.sh#L26
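The cp27m/cp27mu suffixes above track CPython's unicode build width; a small sketch of why Python 3 wheels have no separate UCS-2 variant, using `sys.maxunicode`:

```python
import sys

def unicode_width(maxunicode):
    """Classify a CPython build: narrow (UCS-2) builds report 0xFFFF as
    sys.maxunicode, wide (UCS-4) builds report 0x10FFFF."""
    return "ucs4" if maxunicode == 0x10FFFF else "ucs2"

# CPython 3.3+ always reports 0x10FFFF (PEP 393 flexible string storage),
# which is why manylinux has a single cp37-cp37m path rather than an m/mu
# pair -- matching the cpython_path output shown above.
assert unicode_width(sys.maxunicode) == "ucs4"
```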





[jira] [Updated] (ARROW-5716) [Developer] Improve merge PR script to acknowledge co-authors

2019-07-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5716:
--
Labels: pull-request-available  (was: )

> [Developer] Improve merge PR script to acknowledge co-authors
> -
>
> Key: ARROW-5716
> URL: https://issues.apache.org/jira/browse/ARROW-5716
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> The Apache Spark PR merge tool supports lead/co-author acknowledgement that 
> shows up in the GitHub UI, we should try to follow this
> https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py
> example commit 
> https://github.com/apache/spark/commit/5a7aa6f4df925bf44267f58a8930b93a4e19c4f4
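The mechanism GitHub recognizes is a `Co-authored-by:` trailer in the commit message. A minimal sketch of how a merge script might append such trailers (this is not the Arrow merge tool; names and emails are placeholders):

```python
def commit_message(title, body, co_authors):
    """Build a commit message whose trailers credit co-authors, following
    the Co-authored-by convention that GitHub's UI picks up."""
    trailers = "\n".join(
        f"Co-authored-by: {name} <{email}>" for name, email in co_authors)
    return f"{title}\n\n{body}\n\n{trailers}\n"

msg = commit_message(
    "ARROW-5716: [Developer] Acknowledge co-authors in merge script",
    "Merged pull request body goes here.",
    [("Co Author", "co@example.com")])
assert "Co-authored-by: Co Author <co@example.com>" in msg
```

The trailers must sit in the final paragraph of the message for GitHub to attribute the commit to all listed authors.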





[jira] [Assigned] (ARROW-5716) [Developer] Improve merge PR script to acknowledge co-authors

2019-07-15 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-5716:
---

Assignee: Wes McKinney

> [Developer] Improve merge PR script to acknowledge co-authors
> -
>
> Key: ARROW-5716
> URL: https://issues.apache.org/jira/browse/ARROW-5716
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> The Apache Spark PR merge tool supports lead/co-author acknowledgement that 
> shows up in the GitHub UI, we should try to follow this
> https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py
> example commit 
> https://github.com/apache/spark/commit/5a7aa6f4df925bf44267f58a8930b93a4e19c4f4





[jira] [Commented] (ARROW-5610) [Python] Define extension type API in Python to "receive" or "send" a foreign extension type

2019-07-15 Thread lidavidm (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885281#comment-16885281
 ] 

lidavidm commented on ARROW-5610:
-

[~wesmckinn] I'll try to take a pass this week, if time permits; we would like 
this functionality. (By the way, is there a Jira explicitly for being able to 
hook into to_pandas, or a suggested way to efficiently do a custom Pandas 
conversion?)

> [Python] Define extension type API in Python to "receive" or "send" a foreign 
> extension type
> 
>
> Key: ARROW-5610
> URL: https://issues.apache.org/jira/browse/ARROW-5610
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> In work in ARROW-840, a static {{arrow.py_extension_type}} name is used. 
> There will be cases where an extension type is coming from another 
> programming language (e.g. Java), so it would be useful to be able to "plug 
> in" a Python extension type subclass that will be used to deserialize the 
> extension type coming over the wire. This has some different API requirements 
> since the serialized representation of the type will not have knowledge of 
> Python pickling, etc. 
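The "plug in" idea above amounts to a registry mapping a serialized extension-type name, as it arrives over the wire, onto a locally registered Python class. This is a hypothetical sketch of the shape of such an API, not the pyarrow implementation:

```python
# Hypothetical registry -- illustrates receiving a foreign extension type
# (e.g. one produced by Java) without relying on Python pickling.
_extension_registry = {}

def register_extension(wire_name, cls):
    """Plug in a Python class for a foreign extension-type name."""
    _extension_registry[wire_name] = cls

def receive_extension(wire_name, payload):
    """Deserialize an incoming extension value via the registered class."""
    cls = _extension_registry[wire_name]
    return cls.deserialize(payload)

class UuidType:
    @classmethod
    def deserialize(cls, payload):
        inst = cls()
        inst.payload = payload
        return inst

register_extension("example.uuid", UuidType)
value = receive_extension("example.uuid", b"\x00" * 16)
assert isinstance(value, UuidType)
```

The key difference from the static `arrow.py_extension_type` path is that deserialization here is keyed by the type name in the wire metadata, so the producer needs no knowledge of Python.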





[jira] [Updated] (ARROW-5856) [Python] linking 3rd party cython modules against pyarrow fails since 0.14.0

2019-07-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5856:
--
Labels: pull-request-available  (was: )

> [Python] linking 3rd party cython modules against pyarrow fails since 0.14.0
> 
>
> Key: ARROW-5856
> URL: https://issues.apache.org/jira/browse/ARROW-5856
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.14.0
>Reporter: Steve Stagg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0, 0.14.1
>
> Attachments: setup.py, test.pyx
>
>
> Compiling cython modules that link to the pyarrow library, using the 
> recommended approach for getting the appropriate include and link flags has 
> stopped working for PyArrow 0.14.0.
>  
> A minimal test case is included in the attachments that demonstrates the 
> problem.





[jira] [Resolved] (ARROW-5946) [Rust] [DataFusion] Projection push down with aggregate producing incorrect results

2019-07-15 Thread Andy Grove (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Grove resolved ARROW-5946.
---
   Resolution: Fixed
Fix Version/s: (was: 1.0.0)
   0.14.1

Issue resolved by pull request 4878
[https://github.com/apache/arrow/pull/4878]

> [Rust] [DataFusion] Projection push down with aggregate producing incorrect 
> results
> ---
>
> Key: ARROW-5946
> URL: https://issues.apache.org/jira/browse/ARROW-5946
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust, Rust - DataFusion
>Affects Versions: 0.14.0
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I was testing some queries with the 0.14 release and noticed that the 
> projected schema for a table scan is completely wrong (however the results of 
> the query are not necessarily wrong)
>  
> {code:java}
> // schema for nyxtaxi csv files
> let schema = Schema::new(vec![
> Field::new("VendorID", DataType::Utf8, true),
> Field::new("tpep_pickup_datetime", DataType::Utf8, true),
> Field::new("tpep_dropoff_datetime", DataType::Utf8, true),
> Field::new("passenger_count", DataType::Utf8, true),
> Field::new("trip_distance", DataType::Float64, true),
> Field::new("RatecodeID", DataType::Utf8, true),
> Field::new("store_and_fwd_flag", DataType::Utf8, true),
> Field::new("PULocationID", DataType::Utf8, true),
> Field::new("DOLocationID", DataType::Utf8, true),
> Field::new("payment_type", DataType::Utf8, true),
> Field::new("fare_amount", DataType::Float64, true),
> Field::new("extra", DataType::Float64, true),
> Field::new("mta_tax", DataType::Float64, true),
> Field::new("tip_amount", DataType::Float64, true),
> Field::new("tolls_amount", DataType::Float64, true),
> Field::new("improvement_surcharge", DataType::Float64, true),
> Field::new("total_amount", DataType::Float64, true),
> ]);
> let mut ctx = ExecutionContext::new();
> ctx.register_csv("tripdata", "file.csv", &schema, true);
> let optimized_plan = ctx.create_logical_plan(
> "SELECT passenger_count, MIN(fare_amount), MAX(fare_amount) \
> FROM tripdata GROUP BY passenger_count").unwrap();{code}
>  The projected schema in the table scan contains the first two columns from 
> the schema (VendorID and tpep_pickup_datetime) rather than passenger_count 
> and fare_amount.
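The defect is easiest to reason about with a toy expression walk. The sketch below is illustrative only (DataFusion's real `Expr` and optimizer are richer): a projection push-down pass must collect the column indices referenced by the grouping and aggregate expressions (here passenger_count = index 3, fare_amount = index 10) and hand exactly that set to the scan; the report reads as if the scan kept indices 0 and 1 instead.

```rust
use std::collections::BTreeSet;

// Simplified expression tree -- hypothetical, not DataFusion's actual Expr.
enum Expr {
    Column(usize),
    AggregateMin(Box<Expr>),
    AggregateMax(Box<Expr>),
}

// Collect the indices of every column an expression references; a projection
// push-down pass would hand this set to the table scan so that it projects
// only the needed columns.
fn collect_expr(e: &Expr, accum: &mut BTreeSet<usize>) {
    match e {
        Expr::Column(i) => {
            accum.insert(*i);
        }
        Expr::AggregateMin(inner) | Expr::AggregateMax(inner) => {
            collect_expr(inner, accum)
        }
    }
}
```

For the query above, the collected set should be {3, 10} — the scan's projected schema must then be built from those indices, not from the first N columns.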



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5949) [Rust] Implement DictionaryArray

2019-07-15 Thread Wes McKinney (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885228#comment-16885228
 ] 

Wes McKinney commented on ARROW-5949:
-

I'd recommend looking at what we've done in C++; the implementation and usage 
are fairly mature there.
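As a starting point, the keys/values split the C++ DictionaryArray uses can be sketched in plain Rust. Names and layout here are illustrative, not Arrow's actual types: an indices buffer plus a dictionary of distinct values.

```rust
use std::collections::HashMap;

/// Minimal sketch of dictionary encoding: each logical element is a key
/// (an index into `values`), and `values` holds each distinct value once.
struct DictionaryArray {
    keys: Vec<usize>,
    values: Vec<String>,
}

impl DictionaryArray {
    /// Encode a slice of strings, interning each distinct value once.
    fn encode(input: &[&str]) -> Self {
        let mut lookup: HashMap<String, usize> = HashMap::new();
        let mut keys = Vec::with_capacity(input.len());
        let mut values: Vec<String> = Vec::new();
        for s in input {
            let idx = *lookup.entry((*s).to_string()).or_insert_with(|| {
                values.push((*s).to_string());
                values.len() - 1
            });
            keys.push(idx);
        }
        DictionaryArray { keys, values }
    }

    /// Decode logical element `i` back to its value.
    fn value(&self, i: usize) -> &str {
        &self.values[self.keys[i]]
    }
}
```

A real implementation would store the keys as an Arrow integer array and the dictionary as any Arrow array type, as the C++ version does.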

> [Rust] Implement DictionaryArray
> 
>
> Key: ARROW-5949
> URL: https://issues.apache.org/jira/browse/ARROW-5949
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: David Atienza
>Assignee: Andy Grove
>Priority: Major
>
> I am pretty new to the codebase, but I have seen that DictionaryArray is not 
> implemented in the Rust implementation.
> I went through the list of issues and I could not see any work on this. Is 
> there any blocker?
>  
> The specification is a bit 
> [short|https://arrow.apache.org/docs/format/Layout.html#dictionary-encoding] 
> or even 
> [non-existent|https://arrow.apache.org/docs/format/Metadata.html#dictionary-encoding],
>  so I am not sure how to implement it myself.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5949) [Rust] Implement DictionaryArray

2019-07-15 Thread Andy Grove (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885214#comment-16885214
 ] 

Andy Grove commented on ARROW-5949:
---

I'm not aware of any blockers. I expect this is just a case of nobody needing 
the feature yet.

> [Rust] Implement DictionaryArray
> 
>
> Key: ARROW-5949
> URL: https://issues.apache.org/jira/browse/ARROW-5949
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: David Atienza
>Assignee: Andy Grove
>Priority: Major
>
> I am pretty new to the codebase, but I have seen that DictionaryArray is not 
> implemented in the Rust implementation.
> I went through the list of issues and I could not see any work on this. Is 
> there any blocker?
>  
> The specification is a bit 
> [short|https://arrow.apache.org/docs/format/Layout.html#dictionary-encoding] 
> or even 
> [non-existent|https://arrow.apache.org/docs/format/Metadata.html#dictionary-encoding],
>  so I am not sure how to implement it myself.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (ARROW-5949) [Rust] Implement DictionaryArray

2019-07-15 Thread Andy Grove (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Grove reassigned ARROW-5949:
-

Assignee: Andy Grove

> [Rust] Implement DictionaryArray
> 
>
> Key: ARROW-5949
> URL: https://issues.apache.org/jira/browse/ARROW-5949
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: David Atienza
>Assignee: Andy Grove
>Priority: Major
>
> I am pretty new to the codebase, but I have seen that DictionaryArray is not 
> implemented in the Rust implementation.
> I went through the list of issues and I could not see any work on this. Is 
> there any blocker?
>  
> The specification is a bit 
> [short|https://arrow.apache.org/docs/format/Layout.html#dictionary-encoding] 
> or even 
> [non-existent|https://arrow.apache.org/docs/format/Metadata.html#dictionary-encoding],
>  so I am not sure how to implement it myself.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (ARROW-5950) [Rust] [DataFusion] Add logger dependency

2019-07-15 Thread Andy Grove (JIRA)
Andy Grove created ARROW-5950:
-

 Summary: [Rust] [DataFusion] Add logger dependency
 Key: ARROW-5950
 URL: https://issues.apache.org/jira/browse/ARROW-5950
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust, Rust - DataFusion
Reporter: Andy Grove
 Fix For: 1.0.0


It would be nice to be able to turn on debug logging at runtime and see how 
query plans are built and optimized. I propose adding a dependency on the log 
crate.
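What runtime-toggled debug logging buys can be illustrated with a tiny level-gated logger. This stand-in is a sketch only — the actual proposal is simply a dependency on the `log` facade crate, whose macros do this level check before formatting:

```rust
/// Illustrative stand-in for the `log` crate's level gating: a message is
/// recorded only when the configured maximum level permits it. Variant order
/// gives Error < Info < Debug via the derived ordering.
#[derive(PartialEq, PartialOrd)]
enum Level {
    Error,
    Info,
    Debug,
}

struct Logger {
    max_level: Level,
    records: Vec<String>,
}

impl Logger {
    fn log(&mut self, level: Level, msg: &str) {
        // The level check happens before any work is done, so debug logging
        // of query plans costs almost nothing when disabled at runtime.
        if level <= self.max_level {
            self.records.push(msg.to_string());
        }
    }
}
```

With `max_level: Level::Info`, a `Debug` message about plan optimization is dropped; raising the level at runtime makes it visible without recompiling.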



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5853) [Python] Expose boolean filter kernel on Array

2019-07-15 Thread Wes McKinney (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885191#comment-16885191
 ] 

Wes McKinney commented on ARROW-5853:
-

I'd recommend developing with {{-DCMAKE_BUILD_TYPE=debug}}. We intentionally 
aren't including many runtime type checks (in release mode) or other checks in 
the kernels, since such type checking should have occurred at a higher level.

> [Python] Expose boolean filter kernel on Array
> --
>
> Key: ARROW-5853
> URL: https://issues.apache.org/jira/browse/ARROW-5853
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Joris Van den Bossche
>Priority: Major
>
> Expose the filter kernel (https://issues.apache.org/jira/browse/ARROW-1558) 
> on the Python Array class.
> This could be done as a {{.filter(mask)}} method and/or via {{\_\_getitem\_\_}}.
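The kernel's semantics are simple enough to pin down in a few lines — sketched here in Rust without Arrow types, and with null handling omitted: element i of the input survives exactly when the boolean mask is true at i.

```rust
/// Sketch of a boolean filter kernel's semantics (not Arrow's actual API):
/// keep values[i] exactly when mask[i] is true.
fn filter<T: Clone>(values: &[T], mask: &[bool]) -> Vec<T> {
    // The mask must be the same length as the array being filtered.
    assert_eq!(values.len(), mask.len(), "mask must match array length");
    values
        .iter()
        .zip(mask)
        .filter_map(|(v, &keep)| if keep { Some(v.clone()) } else { None })
        .collect()
}
```

Whether the Python surface is a `.filter(mask)` method or boolean-mask `__getitem__`, the underlying contract is this selection behavior.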



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (ARROW-5934) [Python] Bundle arrow's LICENSE with the wheels

2019-07-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5934:
--
Labels: pull-request-available  (was: )

> [Python] Bundle arrow's LICENSE with the wheels
> ---
>
> Key: ARROW-5934
> URL: https://issues.apache.org/jira/browse/ARROW-5934
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0, 0.14.1
>
>
> Guide to bundle LICENSE files with the wheels: 
> https://wheel.readthedocs.io/en/stable/user_guide.html#including-license-files-in-the-generated-wheel-file
> We also need to ensure that all third-party dependencies' licenses are 
> attached to it, especially because we're statically linking multiple 
> third-party dependencies; for example, uriparser is missing from the LICENSE 
> file.
> cc [~wesmckinn]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (ARROW-5925) [Gandiva][C++] cast decimal to int should round up

2019-07-15 Thread Pindikura Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pindikura Ravindra resolved ARROW-5925.
---
   Resolution: Fixed
Fix Version/s: 0.14.1

Issue resolved by pull request 4864
[https://github.com/apache/arrow/pull/4864]

> [Gandiva][C++] cast decimal to int should round up
> --
>
> Key: ARROW-5925
> URL: https://issues.apache.org/jira/browse/ARROW-5925
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Pindikura Ravindra
>Assignee: Pindikura Ravindra
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
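Assuming Arrow's usual unscaled-integer decimal representation (the stored value is `decimal * 10^scale`), the intended cast behavior — rounding the fractional part rather than truncating — can be sketched as below. The exact rounding mode (half away from zero) is an assumption taken from the issue title, not a statement of Gandiva's implementation:

```rust
/// Sketch: cast a scaled decimal to an integer, rounding half away from zero
/// instead of truncating toward zero.
fn decimal_to_int(unscaled: i64, scale: u32) -> i64 {
    let divisor = 10_i64.pow(scale);
    let quotient = unscaled / divisor;   // truncated integer part
    let remainder = unscaled % divisor;  // signed fractional part, scaled
    // Round away from zero when the fractional part is >= 0.5.
    if 2 * remainder.abs() >= divisor {
        quotient + remainder.signum()
    } else {
        quotient
    }
}
```

So 12.50 (unscaled 1250, scale 2) casts to 13 rather than the truncated 12, and -12.50 casts to -13.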




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (ARROW-5925) [Gandiva][C++] cast decimal to int should round up

2019-07-15 Thread Pindikura Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pindikura Ravindra updated ARROW-5925:
--
Component/s: C++ - Gandiva

> [Gandiva][C++] cast decimal to int should round up
> --
>
> Key: ARROW-5925
> URL: https://issues.apache.org/jira/browse/ARROW-5925
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++ - Gandiva
>Reporter: Pindikura Ravindra
>Assignee: Pindikura Ravindra
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5934) [Python] Bundle arrow's LICENSE with the wheels

2019-07-15 Thread Krisztian Szucs (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885032#comment-16885032
 ] 

Krisztian Szucs commented on ARROW-5934:


We should bundle both the top level LICENSE.txt and the one under pyarrow:
- https://github.com/apache/arrow/blob/master/LICENSE.txt
- https://github.com/apache/arrow/blob/master/python/LICENSE.txt

Although wheel supports adding multiple license files, these two have the same 
name, so to bundle both properly we either need to rename them or concatenate 
them into a single file.
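Concatenation is the simpler of the two options. A hedged sketch (file paths are illustrative, and the wheel guide linked above governs how the combined file is actually bundled):

```rust
use std::fs;
use std::io;

/// Concatenate several license files into one output file, prefixing each
/// with a header naming its source so readers can tell the parts apart.
fn concat_licenses(paths: &[&str], out: &str) -> io::Result<()> {
    let mut combined = String::new();
    for path in paths {
        combined.push_str(&format!("--- {} ---\n", path));
        combined.push_str(&fs::read_to_string(path)?);
        combined.push('\n');
    }
    fs::write(out, combined)
}
```

This could run at wheel build time over the top-level LICENSE.txt and python/LICENSE.txt, producing the single file that gets bundled.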


> [Python] Bundle arrow's LICENSE with the wheels
> ---
>
> Key: ARROW-5934
> URL: https://issues.apache.org/jira/browse/ARROW-5934
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Krisztian Szucs
>Priority: Major
> Fix For: 1.0.0, 0.14.1
>
>
> Guide to bundle LICENSE files with the wheels: 
> https://wheel.readthedocs.io/en/stable/user_guide.html#including-license-files-in-the-generated-wheel-file
> We also need to ensure that all third-party dependencies' licenses are 
> attached to it, especially because we're statically linking multiple 
> third-party dependencies; for example, uriparser is missing from the LICENSE 
> file.
> cc [~wesmckinn]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (ARROW-5919) [R] Add nightly tests for building r-arrow with dependencies from conda-forge

2019-07-15 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-5919.

Resolution: Fixed

Issue resolved by pull request 4855
[https://github.com/apache/arrow/pull/4855]

> [R] Add nightly tests for building r-arrow with dependencies from conda-forge
> -
>
> Key: ARROW-5919
> URL: https://issues.apache.org/jira/browse/ARROW-5919
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, R
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
> Fix For: 1.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (ARROW-5919) [R] Add nightly tests for building r-arrow with dependencies from conda-forge

2019-07-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5919:
--
Labels: pull-request-available  (was: )

> [R] Add nightly tests for building r-arrow with dependencies from conda-forge
> -
>
> Key: ARROW-5919
> URL: https://issues.apache.org/jira/browse/ARROW-5919
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, R
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5902) [Java] Implement hash table and equals & hashCode API for dictionary encoding

2019-07-15 Thread Ji Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884907#comment-16884907
 ] 

Ji Liu commented on ARROW-5902:
---

cc [~emkornfi...@gmail.com]

> [Java] Implement hash table and equals & hashCode API for dictionary encoding
> -
>
> Key: ARROW-5902
> URL: https://issues.apache.org/jira/browse/ARROW-5902
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Ji Liu
>Assignee: Ji Liu
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> As discussed in [https://github.com/apache/arrow/pull/4792]
> Implement a hash table that stores only the hash & index; meanwhile, add an 
> equality-check function to the ValueVector API.
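The idea — the map stores only a hash and an index, never the value itself, and candidates sharing a hash are confirmed with an equality check against the backing vector — can be sketched in Rust. Types and names here are illustrative; the real patch targets Java's ValueVector, whose equals check plays the confirmation role.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Sketch of a hash-and-index dictionary builder: `by_hash` maps a value's
/// hash to the indices of values with that hash; the values live only in
/// `values`, so hash collisions are resolved by an equality check there.
struct Interner {
    values: Vec<String>,
    by_hash: HashMap<u64, Vec<usize>>,
}

impl Interner {
    fn new() -> Self {
        Interner { values: Vec::new(), by_hash: HashMap::new() }
    }

    /// Return the dictionary index for `v`, inserting it if unseen.
    fn intern(&mut self, v: &str) -> usize {
        let mut h = DefaultHasher::new();
        v.hash(&mut h);
        let key = h.finish();
        let candidates = self.by_hash.entry(key).or_default();
        for &i in candidates.iter() {
            if self.values[i] == v {
                return i; // hash matched and equality confirmed
            }
        }
        // New distinct value: record only its index under the hash.
        let i = self.values.len();
        self.values.push(v.to_string());
        candidates.push(i);
        i
    }
}
```

Keeping only (hash, index) in the table avoids duplicating the values, at the cost of one vector lookup per candidate to resolve collisions.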



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)