[jira] [Commented] (ARROW-1557) [PYTHON] pyarrow.Table.from_arrays doesn't validate names length

2017-09-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173558#comment-16173558
 ] 

ASF GitHub Bot commented on ARROW-1557:
---

Github user wesm commented on the issue:

https://github.com/apache/arrow/pull/1117
  
In case it's useful, we have nightly dev builds 



> [PYTHON] pyarrow.Table.from_arrays doesn't validate names length
> 
>
>                 Key: ARROW-1557
>                 URL: https://issues.apache.org/jira/browse/ARROW-1557
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.7.0
>            Reporter: Tom Augspurger
>            Assignee: Tom Augspurger
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.8.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> pa.Table.from_arrays doesn't validate that the lengths of {{arrays}} and 
> {{names}} match. I think this should raise a {{ValueError}}:
> {code}
> In [1]: import pyarrow as pa
> In [2]: pa.Table.from_arrays([pa.array([1, 2]), pa.array([3, 4])], 
> names=['a', 'b', 'c'])
> Out[2]:
> pyarrow.Table
> a: int64
> b: int64
> In [3]: pa.__version__
> Out[3]: '0.7.0'
> {code}
> (This is my first time using JIRA, hopefully I didn't mess up too badly)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1557) [PYTHON] pyarrow.Table.from_arrays doesn't validate names length

2017-09-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173550#comment-16173550
 ] 

ASF GitHub Bot commented on ARROW-1557:
---

Github user asfgit closed the pull request at:

https://github.com/apache/arrow/pull/1117




[jira] [Commented] (ARROW-1557) [PYTHON] pyarrow.Table.from_arrays doesn't validate names length

2017-09-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173060#comment-16173060
 ] 

ASF GitHub Bot commented on ARROW-1557:
---

Github user wesm commented on the issue:

https://github.com/apache/arrow/pull/1117
  
`if not K` is probably better; feel free to make that change too.
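
For readers unfamiliar with the idiom, here is a minimal sketch of what `if not K` tests. The thread does not say what `K` is; the example assumes it is a non-negative integer count (for instance the number of arrays), and `is_empty_count` is a hypothetical helper, not code from the patch.

```python
def is_empty_count(K):
    # Assumption: K is a non-negative integer count, e.g. len(arrays).
    # For integers, `not K` is True exactly when K == 0.
    return not K


# Both spellings agree for integer counts.
for K in (0, 1, 2):
    assert is_empty_count(K) == (K == 0)
```

Note that `not K` also treats `None` and empty containers as falsy, so the two spellings are only interchangeable when `K` is known to be an integer.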




[jira] [Commented] (ARROW-1557) [PYTHON] pyarrow.Table.from_arrays doesn't validate names length

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172749#comment-16172749
 ] 

ASF GitHub Bot commented on ARROW-1557:
---

Github user wesm commented on the issue:

https://github.com/apache/arrow/pull/1117
  
Here's a fix to cherry-pick:
https://github.com/wesm/arrow/commit/965a560867f45025dcbfe50c572593faa7d7cb33




[jira] [Commented] (ARROW-1557) [PYTHON] pyarrow.Table.from_arrays doesn't validate names length

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172745#comment-16172745
 ] 

ASF GitHub Bot commented on ARROW-1557:
---

Github user wesm commented on the issue:

https://github.com/apache/arrow/pull/1117
  
It appears this patch exposed a test failure; can you fix it?




[jira] [Commented] (ARROW-1557) [PYTHON] pyarrow.Table.from_arrays doesn't validate names length

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172608#comment-16172608
 ] 

ASF GitHub Bot commented on ARROW-1557:
---

GitHub user TomAugspurger opened a pull request:

https://github.com/apache/arrow/pull/1117

ARROW-1557 [Python] Validate names length in Table.from_arrays

We now raise a ValueError when the length of the names doesn't match
the length of the arrays.

```python
In [1]: import pyarrow as pa

In [2]: pa.Table.from_arrays([pa.array([1, 2]), pa.array([3, 4])], 
names=['a', 'b', 'c'])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 pa.Table.from_arrays([pa.array([1, 2]), pa.array([3, 4])], 
names=['a', 'b', 'c'])

table.pxi in pyarrow.lib.Table.from_arrays()

table.pxi in pyarrow.lib._schema_from_arrays()

ValueError: Length of names (3) does not match length of arrays (2)
```

This affected `RecordBatch.from_arrays` and `Table.from_arrays`.
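
As a rough illustration of the check described above, here is a pure-Python sketch. It is not the actual patch in `table.pxi`: `validate_names` is a hypothetical helper, and plain lists stand in for pyarrow arrays so the example runs without pyarrow installed. Only the error message format is taken from the traceback above.

```python
def validate_names(arrays, names):
    """Hypothetical helper: pair names with arrays, or raise on a length mismatch."""
    n_arrays = len(arrays)
    n_names = len(names)
    if n_names != n_arrays:
        # Message format mirrors the one shown in the traceback above.
        raise ValueError(
            "Length of names ({}) does not match length of arrays ({})".format(
                n_names, n_arrays
            )
        )
    return list(zip(names, arrays))


# Plain lists stand in for pyarrow arrays (an assumption made for portability).
try:
    validate_names([[1, 2], [3, 4]], ["a", "b", "c"])
except ValueError as exc:
    message = str(exc)
```

Checking the lengths before building anything means the caller gets an explicit error instead of a table that silently drops the extra name.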



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/TomAugspurger/arrow validate-names

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/arrow/pull/1117.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1117


commit ed74d52249fabde739cf0599be0210c818b5d272
Author: Tom Augspurger 
Date:   2017-09-20T01:44:44Z

ARROW-1557 [Python] Validate names length in Table.from_arrays

We now raise a ValueError when the length of the names doesn't match
the length of the arrays.




> [PYTHON] pyarrow.Table.from_arrays doesn't validate names length
> 
>
>                 Key: ARROW-1557
>                 URL: https://issues.apache.org/jira/browse/ARROW-1557
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.7.0
>            Reporter: Tom Augspurger
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.8.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> pa.Table.from_arrays doesn't validate that the lengths of {{arrays}} and 
> {{names}} match. I think this should raise a {{ValueError}}:
> {code}
> In [1]: import pyarrow as pa
> In [2]: pa.Table.from_arrays([pa.array([1, 2]), pa.array([3, 4])], 
> names=['a', 'b', 'c'])
> Out[2]:
> pyarrow.Table
> a: int64
> b: int64
> In [3]: pa.__version__
> Out[3]: '0.7.0'
> {code}
> (This is my first time using JIRA, hopefully I didn't mess up too badly)





[jira] [Commented] (ARROW-1557) [PYTHON] pyarrow.Table.from_arrays doesn't validate names length

2017-09-19 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172019#comment-16172019
 ] 

Wes McKinney commented on ARROW-1557:
-

Agreed! Thanks for the bug report.

> [PYTHON] pyarrow.Table.from_arrays doesn't validate names length
> 
>
>                 Key: ARROW-1557
>                 URL: https://issues.apache.org/jira/browse/ARROW-1557
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.7.0
>            Reporter: Tom Augspurger
>            Priority: Minor
>             Fix For: 0.8.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> pa.Table.from_arrays doesn't validate that the lengths of {{arrays}} and 
> {{names}} match. I think this should raise a {{ValueError}}:
> {code}
> In [1]: import pyarrow as pa
> In [2]: pa.Table.from_arrays([pa.array([1, 2]), pa.array([3, 4])], 
> names=['a', 'b', 'c'])
> Out[2]:
> pyarrow.Table
> a: int64
> b: int64
> In [3]: pa.__version__
> Out[3]: '0.7.0'
> {code}
> (This is my first time using JIRA, hopefully I didn't mess up too badly)





[jira] [Commented] (ARROW-1557) [PYTHON] pyarrow.Table.from_arrays doesn't validate names length

2017-09-19 Thread Tom Augspurger (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172007#comment-16172007
 ] 

Tom Augspurger commented on ARROW-1557:
---

I can probably submit a fix on Thursday or Friday.

> [PYTHON] pyarrow.Table.from_arrays doesn't validate names length
> 
>
>                 Key: ARROW-1557
>                 URL: https://issues.apache.org/jira/browse/ARROW-1557
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.7.0
>            Reporter: Tom Augspurger
>            Priority: Minor
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> pa.Table.from_arrays doesn't validate that the lengths of {{arrays}} and 
> {{names}} match. I think this should raise a {{ValueError}}:
> {code}
> In [1]: import pyarrow as pa
> In [2]: pa.Table.from_arrays([pa.array([1, 2]), pa.array([3, 4])], 
> names=['a', 'b', 'c'])
> Out[2]:
> pyarrow.Table
> a: int64
> b: int64
> In [3]: pa.__version__
> Out[3]: '0.7.0'
> {code}
> (This is my first time using JIRA, hopefully I didn't mess up too badly)


