[jira] [Assigned] (ARROW-5809) [Rust] Dockerize (add to docker-compose) Rust Travis CI build

2019-09-27 Thread Andy Grove (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Grove reassigned ARROW-5809:
-

Assignee: Andy Grove

> [Rust] Dockerize (add to docker-compose) Rust Travis CI build
> -
>
> Key: ARROW-5809
> URL: https://issues.apache.org/jira/browse/ARROW-5809
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, Rust
>Reporter: Wes McKinney
>Assignee: Andy Grove
>Priority: Major
> Fix For: 1.0.0
>
>
> https://github.com/apache/arrow/blob/master/.travis.yml#L306



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-6731) [CI] [Rust] Set up Github Action to run Rust tests

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6731:
--
Labels: pull-request-available  (was: )

> [CI] [Rust] Set up Github Action to run Rust tests
> --
>
> Key: ARROW-6731
> URL: https://issues.apache.org/jira/browse/ARROW-6731
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: CI, Rust
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> Set up Github Action to run Rust tests





[jira] [Created] (ARROW-6731) [CI] [Rust] Set up Github Action to run Rust tests

2019-09-27 Thread Andy Grove (Jira)
Andy Grove created ARROW-6731:
-

 Summary: [CI] [Rust] Set up Github Action to run Rust tests
 Key: ARROW-6731
 URL: https://issues.apache.org/jira/browse/ARROW-6731
 Project: Apache Arrow
  Issue Type: Improvement
  Components: CI, Rust
Reporter: Andy Grove
Assignee: Andy Grove
 Fix For: 1.0.0


Set up Github Action to run Rust tests





[jira] [Updated] (ARROW-6625) [Python] Allow concat_tables to null or default fill missing columns

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6625:
--
Labels: pull-request-available  (was: )

> [Python] Allow concat_tables to null or default fill missing columns
> 
>
> Key: ARROW-6625
> URL: https://issues.apache.org/jira/browse/ARROW-6625
> Project: Apache Arrow
>  Issue Type: Wish
>  Components: Python
>Reporter: Daniel Nugent
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> The concat_tables function currently requires schemas to be identical across 
> all tables being concatenated. However, tables occasionally agree on the 
> type of every column they share while some columns are absent from some 
> tables.
> In this case, allowing for null filling (or default filling) would be ideal.
> I imagine this feature would be an optional parameter on the concat_tables 
> function. Presumably the argument could be either a boolean in the case of 
> blanket null filling, or a mapping type for default filling. If a user wanted 
> to default fill some columns, but null fill others, they could use a None as 
> the value (defaultdict would make it simple to provide a blanket null fill if 
> only a few default value columns were desired).
> If a mapping wasn't present, the function should probably raise an error.
> The default behavior would be the current and thus the default value of the 
> parameter should be False or None.
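
The proposed semantics can be sketched in plain Python. This is only a toy model, not the pyarrow API: a "table" here is a dict mapping column names to lists, and the function name concat_with_fill and its fill parameter are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List, Mapping, Union

# Toy stand-in for a table: column name -> list of values (an assumption,
# not a pyarrow type).
Table = Dict[str, List[Any]]


def concat_with_fill(tables: List[Table],
                     fill: Union[bool, None, Mapping[str, Any]] = False) -> Table:
    """Concatenate toy tables, filling columns that some tables lack.

    fill=False or None: strict mode, every table must have every column.
    fill=True:          blanket null fill (None) for missing columns.
    fill=<mapping>:     per-column default; a value of None means null fill.
                        A missing column absent from the mapping raises KeyError.
    """
    # Collect the union of column names in first-seen order.
    columns: List[str] = []
    for table in tables:
        for name in table:
            if name not in columns:
                columns.append(name)

    result: Table = {name: [] for name in columns}
    for table in tables:
        num_rows = len(next(iter(table.values()))) if table else 0
        for name in columns:
            if name in table:
                result[name].extend(table[name])
            elif fill is None or fill is False:
                raise ValueError(f"schema mismatch: column {name!r} is missing")
            elif fill is True:
                result[name].extend([None] * num_rows)
            else:
                result[name].extend([fill[name]] * num_rows)
    return result
```

With this sketch, passing a collections.defaultdict(lambda: None) that holds a few explicit entries gives default values for those columns and null fill for every other missing column, as the description suggests.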





[jira] [Updated] (ARROW-6728) [C#] Support reading and writing Date32 and Date64 arrays

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6728:
--
Labels: pull-request-available  (was: )

> [C#] Support reading and writing Date32 and Date64 arrays
> -
>
> Key: ARROW-6728
> URL: https://issues.apache.org/jira/browse/ARROW-6728
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C#
>Reporter: Anthony Abate
>Assignee: Anthony Abate
>Priority: Major
>  Labels: pull-request-available
>
> The C# implementation doesn't support reading and writing Date32 and Date64 
> arrays. We need to add support and some tests.
> It looks like it is only a couple of lines to get this enabled. See 
> [https://github.com/apache/arrow/pull/5413].
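
For context, Arrow's Date32 stores days since the UNIX epoch as a 32-bit integer, and Date64 stores milliseconds since the UNIX epoch as a 64-bit integer. The conversion can be sketched as follows; this is a Python illustration of the encoding only, not the C# implementation, and the helper names are hypothetical.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)
MS_PER_DAY = 86_400_000


def date_to_date32(d: date) -> int:
    """Encode a calendar date as days since the UNIX epoch (Date32)."""
    return (d - EPOCH).days


def date32_to_date(days: int) -> date:
    """Decode a Date32 value back to a calendar date."""
    return EPOCH + timedelta(days=days)


def date_to_date64(d: date) -> int:
    """Encode a calendar date as milliseconds since the UNIX epoch (Date64)."""
    return date_to_date32(d) * MS_PER_DAY


def date64_to_date(ms: int) -> date:
    """Decode a Date64 value; any sub-day remainder is floored away."""
    return date32_to_date(ms // MS_PER_DAY)
```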





[jira] [Assigned] (ARROW-6728) [C#] Support reading and writing Date32 and Date64 arrays

2019-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou reassigned ARROW-6728:
---

Assignee: Anthony Abate

> [C#] Support reading and writing Date32 and Date64 arrays
> -
>
> Key: ARROW-6728
> URL: https://issues.apache.org/jira/browse/ARROW-6728
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C#
>Reporter: Anthony Abate
>Assignee: Anthony Abate
>Priority: Major
>
> The C# implementation doesn't support reading and writing Date32 and Date64 
> arrays. We need to add support and some tests.
> It looks like it is only a couple of lines to get this enabled. See 
> [https://github.com/apache/arrow/pull/5413].





[jira] [Updated] (ARROW-6728) [C#] Support reading and writing Date32 and Date64 arrays

2019-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-6728:

Reporter: Anthony Abate  (was: Eric Erhardt)

> [C#] Support reading and writing Date32 and Date64 arrays
> -
>
> Key: ARROW-6728
> URL: https://issues.apache.org/jira/browse/ARROW-6728
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C#
>Reporter: Anthony Abate
>Priority: Major
>
> The C# implementation doesn't support reading and writing Date32 and Date64 
> arrays. We need to add support and some tests.
> It looks like it is only a couple of lines to get this enabled. See 
> [https://github.com/apache/arrow/pull/5413].





[jira] [Resolved] (ARROW-6714) [R] Fix untested RecordBatchWriter case

2019-09-27 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson resolved ARROW-6714.

Fix Version/s: 1.0.0
   Resolution: Fixed

Issue resolved by pull request 5518
[https://github.com/apache/arrow/pull/5518]

> [R] Fix untested RecordBatchWriter case
> ---
>
> Key: ARROW-6714
> URL: https://issues.apache.org/jira/browse/ARROW-6714
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Passing a data.frame to RecordBatchWriter$write() would trigger a segfault





[jira] [Resolved] (ARROW-6701) [C++][R] Lint failing on R cpp code

2019-09-27 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson resolved ARROW-6701.

Resolution: Fixed

Issue resolved by pull request 5514
[https://github.com/apache/arrow/pull/5514]

> [C++][R] Lint failing on R cpp code
> ---
>
> Key: ARROW-6701
> URL: https://issues.apache.org/jira/browse/ARROW-6701
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Continuous Integration, R
>Reporter: Micah Kornfield
>Assignee: Neal Richardson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> [See as an example 
> https://travis-ci.org/apache/arrow/jobs/589772132#L695|https://travis-ci.org/apache/arrow/jobs/589772132#L695]





[jira] [Updated] (ARROW-6429) [CI][Crossbow] Nightly spark integration job fails

2019-09-27 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-6429:
---
Fix Version/s: (was: 0.15.0)
   1.0.0

> [CI][Crossbow] Nightly spark integration job fails
> --
>
> Key: ARROW-6429
> URL: https://issues.apache.org/jira/browse/ARROW-6429
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration
>Reporter: Neal Richardson
>Assignee: Wes McKinney
>Priority: Blocker
>  Labels: nightly, pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> See https://circleci.com/gh/ursa-labs/crossbow/2310. Either fix, skip job and 
> create followup Jira to unskip, or delete job.





[jira] [Updated] (ARROW-6532) [R] Write parquet files with compression

2019-09-27 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-6532:
---
Fix Version/s: (was: 0.15.0)
   1.0.0

> [R] Write parquet files with compression
> 
>
> Key: ARROW-6532
> URL: https://issues.apache.org/jira/browse/ARROW-6532
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: R
>Reporter: Neal Richardson
>Assignee: Romain Francois
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Followup to ARROW-6360. See ARROW-6216 for the C++ side. `write_parquet()` 
> should be able to write compressed files, including with a specified 
> compression level.





[jira] [Updated] (ARROW-6716) [CI] [Rust] New 1.40.0 nightly causing builds to fail

2019-09-27 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-6716:
---
Fix Version/s: (was: 0.15.0)
   1.0.0

> [CI] [Rust] New 1.40.0 nightly causing builds to fail
> -
>
> Key: ARROW-6716
> URL: https://issues.apache.org/jira/browse/ARROW-6716
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: CI, Rust
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> So much for pinning the nightly version ... that doesn't work when there is a 
> new major version of a nightly apparently.
> Travis is now using:
> {code:java}
> rustc 1.40.0-nightly (37538aa13 2019-09-25) {code}
> Despite rust-toolchain containing:
> {code:java}
> nightly-2019-07-30 {code}
>  
>  
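
A CI job could detect this kind of mismatch early by comparing the pinned toolchain string with the output of rustc --version. The following is a rough sketch, not part of the Arrow build: the function names are hypothetical, and the one-day slack is an assumption that the commit date printed by rustc may trail the nightly's date by a day.

```python
import re
from datetime import date, timedelta


def pinned_nightly_date(toolchain: str) -> date:
    """Parse a rust-toolchain entry such as 'nightly-2019-07-30'."""
    match = re.fullmatch(r"nightly-(\d{4})-(\d{2})-(\d{2})", toolchain.strip())
    if match is None:
        raise ValueError(f"not a dated nightly: {toolchain!r}")
    return date(*map(int, match.groups()))


def rustc_commit_date(version_line: str) -> date:
    """Parse the commit date from a line like
    'rustc 1.40.0-nightly (37538aa13 2019-09-25)'."""
    match = re.search(r"\(\w+ (\d{4})-(\d{2})-(\d{2})\)", version_line)
    if match is None:
        raise ValueError(f"no commit date in: {version_line!r}")
    return date(*map(int, match.groups()))


def matches_pin(toolchain: str, version_line: str, slack_days: int = 1) -> bool:
    """True if the compiler's commit date is at most slack_days before the pin.

    The slack is an assumption, not a documented guarantee.
    """
    gap = pinned_nightly_date(toolchain) - rustc_commit_date(version_line)
    return timedelta(0) <= gap <= timedelta(days=slack_days)
```

Applied to the two strings quoted in this issue, matches_pin returns False, which a CI step could turn into an early failure with a clear message.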





[jira] [Updated] (ARROW-6606) [C++] Construct tree structure from std::vector

2019-09-27 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-6606:
---
Fix Version/s: (was: 0.15.0)
   1.0.0

> [C++] Construct tree structure from std::vector
> --
>
> Key: ARROW-6606
> URL: https://issues.apache.org/jira/browse/ARROW-6606
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Francois Saint-Jacques
>Assignee: Francois Saint-Jacques
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> This will be used by FileSystemDataSource for pushdown predicate pruning of 
> branches.
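
The idea of building a tree from a flat listing and then pruning whole branches can be illustrated as below. This is a Python sketch under stated assumptions, not the C++ design: paths are plain strings separated by "/", the tree is a nested dict, and build_tree and collect_leaves are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List

# Nested dict: path component -> subtree; a leaf has an empty subtree.
Tree = Dict[str, "Tree"]


def build_tree(paths: Iterable[str]) -> Tree:
    """Turn a flat list of '/'-separated paths into a nested tree."""
    root: Tree = {}
    for path in paths:
        node = root
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})
    return root


def collect_leaves(tree: Tree, keep: Callable[[str], bool],
                   prefix: str = "") -> List[str]:
    """Return leaf paths, skipping any branch whose path fails keep()."""
    leaves: List[str] = []
    for name in sorted(tree):
        path = f"{prefix}/{name}" if prefix else name
        if not keep(path):
            continue  # prune the whole branch without visiting its children
        children = tree[name]
        if children:
            leaves.extend(collect_leaves(children, keep, path))
        else:
            leaves.append(path)
    return leaves
```

The point of the tree shape is that a predicate rejected at a directory node removes every file beneath it in one step, instead of testing each file separately.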





[jira] [Updated] (ARROW-3808) [R] Implement [.arrow::Array

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3808:
--
Labels: pull-request-available  (was: )

> [R] Implement [.arrow::Array
> 
>
> Key: ARROW-3808
> URL: https://issues.apache.org/jira/browse/ARROW-3808
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: R
>Reporter: Romain Francois
>Assignee: Neal Richardson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>






[jira] [Updated] (ARROW-6730) [CI] Use Github Actions for "C++ with clang 7" docker image

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6730:
--
Labels: pull-request-available  (was: )

> [CI] Use Github Actions for "C++ with clang 7" docker image
> ---
>
> Key: ARROW-6730
> URL: https://issues.apache.org/jira/browse/ARROW-6730
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Continuous Integration
>Reporter: Francois Saint-Jacques
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Created] (ARROW-6730) [CI] Use Github Actions for "C++ with clang 7" docker image

2019-09-27 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-6730:
-

 Summary: [CI] Use Github Actions for "C++ with clang 7" docker 
image
 Key: ARROW-6730
 URL: https://issues.apache.org/jira/browse/ARROW-6730
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Continuous Integration
Reporter: Francois Saint-Jacques








[jira] [Resolved] (ARROW-6532) [R] Write parquet files with compression

2019-09-27 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson resolved ARROW-6532.

Fix Version/s: (was: 1.0.0)
   0.15.0
   Resolution: Fixed

Issue resolved by pull request 5451
[https://github.com/apache/arrow/pull/5451]

> [R] Write parquet files with compression
> 
>
> Key: ARROW-6532
> URL: https://issues.apache.org/jira/browse/ARROW-6532
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: R
>Reporter: Neal Richardson
>Assignee: Romain Francois
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Followup to ARROW-6360. See ARROW-6216 for the C++ side. `write_parquet()` 
> should be able to write compressed files, including with a specified 
> compression level.





[jira] [Updated] (ARROW-6711) [C++] Consolidate Filter and Expression classes

2019-09-27 Thread Ben Kietzman (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kietzman updated ARROW-6711:

Description: 
There is unnecessary boilerplate required when using the Filter/Expression 
classes. Filter is no longer necessary; it (and FilterVector) can be replaced 
with Expression. Expression is sufficiently general that it can be subclassed 
to provide any custom functionality which would have been added through a 
GenericFilter (add some tests for this).

Additionally rows within RecordBatches yielded from a scan are not currently 
filtered using Expression::Evaluate(). (Add tests ensuring both row filtering 
and pruning obey Kleene logic)

Add some comments on the mechanism of {{Assume()}} too, and refactor it not to 
return a Result (its failure modes are covered by {{Validate()}})

  was:
There is unnecessary boilerplate required when using the Filter/Expression 
classes. Filter is no longer necessary; it (and FilterVector) can be replaced 
with Expression. Expression is sufficiently general that it can be subclassed 
to provide any custom functionality which would have been added through a 
GenericFilter (add some tests for this).

Additionally rows within RecordBatches yielded from a scan are not currently 
filtered using Expression::Evaluate(). (Add tests ensuring both row filtering 
and pruning obey Kleene logic)

Add some comments on the mechanism of {{Assume()}} too


> [C++] Consolidate Filter and Expression classes
> ---
>
> Key: ARROW-6711
> URL: https://issues.apache.org/jira/browse/ARROW-6711
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Major
>  Labels: dataset
> Fix For: 1.0.0
>
>
> There is unnecessary boilerplate required when using the Filter/Expression 
> classes. Filter is no longer necessary; it (and FilterVector) can be replaced 
> with Expression. Expression is sufficiently general that it can be subclassed 
> to provide any custom functionality which would have been added through a 
> GenericFilter (add some tests for this).
> Additionally rows within RecordBatches yielded from a scan are not currently 
> filtered using Expression::Evaluate(). (Add tests ensuring both row filtering 
> and pruning obey Kleene logic)
> Add some comments on the mechanism of {{Assume()}} too, and refactor it not 
> to return a Result (its failure modes are covered by {{Validate()}})
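
For readers unfamiliar with Kleene logic, here is a small Python sketch of the three-valued operators, with None standing in for null. It is not the Arrow C++ implementation. The keep_row and can_prune helpers are hypothetical and encode an assumed, SQL-like policy: a row is kept only when the predicate is definitely true, and a branch is pruned only when the predicate is definitely false.

```python
from typing import Optional

Tri = Optional[bool]  # True, False, or None (null / unknown)


def kleene_and(a: Tri, b: Tri) -> Tri:
    """False dominates; otherwise any null makes the result null."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def kleene_or(a: Tri, b: Tri) -> Tri:
    """True dominates; otherwise any null makes the result null."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def kleene_not(a: Tri) -> Tri:
    """Null stays null; otherwise negate."""
    return None if a is None else not a


def keep_row(predicate: Tri) -> bool:
    """Assumed row-filter policy: keep only definite matches."""
    return predicate is True


def can_prune(predicate: Tri) -> bool:
    """Assumed pruning policy: skip a branch only if it definitely cannot match."""
    return predicate is False
```

Under these assumed policies, an unknown predicate is treated conservatively in both directions: the row is not emitted, and the branch is still scanned.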





[jira] [Updated] (ARROW-6711) [C++] Consolidate Filter and Expression classes

2019-09-27 Thread Ben Kietzman (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kietzman updated ARROW-6711:

Description: 
There is unnecessary boilerplate required when using the Filter/Expression 
classes. Filter is no longer necessary; it (and FilterVector) can be replaced 
with Expression. Expression is sufficiently general that it can be subclassed 
to provide any custom functionality which would have been added through a 
GenericFilter (add some tests for this).

Additionally rows within RecordBatches yielded from a scan are not currently 
filtered using Expression::Evaluate(). (Add tests ensuring both row filtering 
and pruning obey Kleene logic)

Add some comments on the mechanism of {{Assume()}} too

  was:
There is unnecessary boilerplate required when using the Filter/Expression 
classes. Filter is no longer necessary; it (and FilterVector) can be replaced 
with Expression. Expression is sufficiently general that it can be subclassed 
to provide any custom functionality which would have been added through a 
GenericFilter (add some tests for this).

Additionally rows within RecordBatches yielded from a scan are not currently 
filtered using Expression::Evaluate(). (Add tests ensuring both row filtering 
and pruning obey Kleene logic)


> [C++] Consolidate Filter and Expression classes
> ---
>
> Key: ARROW-6711
> URL: https://issues.apache.org/jira/browse/ARROW-6711
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Major
>  Labels: dataset
> Fix For: 1.0.0
>
>
> There is unnecessary boilerplate required when using the Filter/Expression 
> classes. Filter is no longer necessary; it (and FilterVector) can be replaced 
> with Expression. Expression is sufficiently general that it can be subclassed 
> to provide any custom functionality which would have been added through a 
> GenericFilter (add some tests for this).
> Additionally rows within RecordBatches yielded from a scan are not currently 
> filtered using Expression::Evaluate(). (Add tests ensuring both row filtering 
> and pruning obey Kleene logic)
> Add some comments on the mechanism of {{Assume()}} too





[jira] [Assigned] (ARROW-3808) [R] Implement [.arrow::Array

2019-09-27 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson reassigned ARROW-3808:
--

Assignee: Neal Richardson

> [R] Implement [.arrow::Array
> 
>
> Key: ARROW-3808
> URL: https://issues.apache.org/jira/browse/ARROW-3808
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: R
>Reporter: Romain Francois
>Assignee: Neal Richardson
>Priority: Major
> Fix For: 1.0.0
>
>






[jira] [Commented] (ARROW-6213) [C++] tests fail for AVX512

2019-09-27 Thread Charles Coulombe (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939655#comment-16939655
 ] 

Charles Coulombe commented on ARROW-6213:
-

[~apitrou] sorry for the lack of comments; these emails ended up being 
filtered. I'll be able to respond more quickly now.

I could also provide an AVX512 machine with our build environment if you'd like 
(in North America). 
I work at Compute Canada and we use a build system called EasyBuild. It is 
easier to share the full log, so here it is. It contains the test output 
and the build steps (and their stdout): 
[^easybuild-arrow-0.14.1-20190809.34.MgMEK.log]
Let me know if the tests could be more verbose; I could rerun them with 
different options.

I'll try to find some time in the coming weeks to dig into this, but I do not 
know the internals of Arrow.

> [C++] tests fail for AVX512
> ---
>
> Key: ARROW-6213
> URL: https://issues.apache.org/jira/browse/ARROW-6213
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.14.1
> Environment: CentOS 7.6.1810, Intel Xeon Processor (Skylake, IBRS) 
> avx512
>Reporter: Charles Coulombe
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: arrow-0.14.1-c++-failed-tests-cmake-conf.txt, 
> arrow-0.14.1-c++-failed-tests.txt, 
> easybuild-arrow-0.14.1-20190809.34.MgMEK.log
>
>
> When building libraries for avx512 with GCC 7.3.0, two C++ tests fail.
> {noformat}
> The following tests FAILED: 
>   28 - arrow-compute-compare-test (Failed) 
>   30 - arrow-compute-filter-test (Failed) 
> Errors while running CTest{noformat}
> while for avx2 they pass.
>  





[jira] [Updated] (ARROW-6213) [C++] tests fail for AVX512

2019-09-27 Thread Charles Coulombe (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Coulombe updated ARROW-6213:

Attachment: easybuild-arrow-0.14.1-20190809.34.MgMEK.log

> [C++] tests fail for AVX512
> ---
>
> Key: ARROW-6213
> URL: https://issues.apache.org/jira/browse/ARROW-6213
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.14.1
> Environment: CentOS 7.6.1810, Intel Xeon Processor (Skylake, IBRS) 
> avx512
>Reporter: Charles Coulombe
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: arrow-0.14.1-c++-failed-tests-cmake-conf.txt, 
> arrow-0.14.1-c++-failed-tests.txt, 
> easybuild-arrow-0.14.1-20190809.34.MgMEK.log
>
>
> When building libraries for avx512 with GCC 7.3.0, two C++ tests fail.
> {noformat}
> The following tests FAILED: 
>   28 - arrow-compute-compare-test (Failed) 
>   30 - arrow-compute-filter-test (Failed) 
> Errors while running CTest{noformat}
> while for avx2 they pass.
>  





[jira] [Updated] (ARROW-6729) [C++] StlStringBuffer constructor is not zero-copy

2019-09-27 Thread Wes McKinney (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-6729:

Summary: [C++] StlStringBuffer constructor is not zero-copy  (was: 
StlStringBuffer constructor is not zero-copy)

> [C++] StlStringBuffer constructor is not zero-copy
> --
>
> Key: ARROW-6729
> URL: https://issues.apache.org/jira/browse/ARROW-6729
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Pasha Stetsenko
>Priority: Trivial
>
> Fixed in [https://github.com/apache/arrow/pull/5517]





[jira] [Created] (ARROW-6729) StlStringBuffer constructor is not zero-copy

2019-09-27 Thread Pasha Stetsenko (Jira)
Pasha Stetsenko created ARROW-6729:
--

 Summary: StlStringBuffer constructor is not zero-copy
 Key: ARROW-6729
 URL: https://issues.apache.org/jira/browse/ARROW-6729
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Pasha Stetsenko


Fixed in [https://github.com/apache/arrow/pull/5517]





[jira] [Commented] (ARROW-6726) [CI] Figure out how to run reliably the fuzzit task

2019-09-27 Thread Yevgeny Pats (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939610#comment-16939610
 ] 

Yevgeny Pats commented on ARROW-6726:
-

This should work, but isn't it already committed? Can you send me a link to the 
failing CI?
{code:java}
export FUZZIT_API_KEY=
./fuzzit create job --type fuzzing --host bionic-llvm7 apache-arrow/arrow-ipc-fuzzing {code}

> [CI] Figure out how to run reliably the fuzzit task
> ---
>
> Key: ARROW-6726
> URL: https://issues.apache.org/jira/browse/ARROW-6726
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Priority: Major
>
> It was disabled in https://github.com/apache/arrow/pull/5528





[jira] [Commented] (ARROW-6726) [CI] Figure out how to run reliably the fuzzit task

2019-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939605#comment-16939605
 ] 

Antoine Pitrou commented on ARROW-6726:
---

[~kszucs] Please read your Github notification e-mails. See 
https://github.com/apache/arrow/pull/5407

> [CI] Figure out how to run reliably the fuzzit task
> ---
>
> Key: ARROW-6726
> URL: https://issues.apache.org/jira/browse/ARROW-6726
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Priority: Major
>
> It was disabled in https://github.com/apache/arrow/pull/5528





[jira] [Commented] (ARROW-6713) [Python] Getting "ArrowIOError: Corrupted file, smaller than file footer" when reading large number of parquet files to ParquetDataset()

2019-09-27 Thread Harini Kannan (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939566#comment-16939566
 ] 

Harini Kannan commented on ARROW-6713:
--

Ah, that was the issue. There were a couple of Parquet files with zero bytes, 
which were causing the error. Removing them solved the issue. Thanks for the 
help!
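
For anyone hitting the same error, a quick way to locate such files is to scan for anything too small to be valid Parquet. The sketch below relies on the Parquet layout, where a file starts with the 4-byte magic PAR1 and ends with a 4-byte footer length followed by PAR1 again, so a valid file has at least 12 bytes. The function name and the .parquet suffix filter are assumptions made for illustration.

```python
import os
from typing import List

# 4-byte leading magic + 4-byte footer length + 4-byte trailing magic.
MIN_PARQUET_SIZE = 12


def find_truncated_parquet_files(root: str) -> List[str]:
    """Return paths under root ending in .parquet that are too small to be valid."""
    too_small: List[str] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".parquet"):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) < MIN_PARQUET_SIZE:
                too_small.append(path)
    return sorted(too_small)
```

This only catches files that are too short, such as the zero-byte ones here. A file that passes the size check can still be corrupted in other ways.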

> [Python] Getting "ArrowIOError: Corrupted file, smaller than file footer" 
> when reading large number of parquet files to ParquetDataset()
> 
>
> Key: ARROW-6713
> URL: https://issues.apache.org/jira/browse/ARROW-6713
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Harini Kannan
>Priority: Major
>  Labels: parquet
> Attachments: Screen Shot 2019-09-26 at 2.30.49 PM.png
>
>
> Reading a large number of Parquet files (> 600) into ParquetDataset() fails 
> with the error: 
> ArrowIOError: Corrupted file, smaller than file footer.
>  
> This could be related to this issue: 
> https://issues.apache.org/jira/browse/ARROW-3424
> Note:
> - This works fine for a small number of Parquet files (< 245 to be exact, not 
> sure if this helps).
>  
>  





[jira] [Created] (ARROW-6728) [C#] Support reading and writing Date32 and Date64 arrays

2019-09-27 Thread Eric Erhardt (Jira)
Eric Erhardt created ARROW-6728:
---

 Summary: [C#] Support reading and writing Date32 and Date64 arrays
 Key: ARROW-6728
 URL: https://issues.apache.org/jira/browse/ARROW-6728
 Project: Apache Arrow
  Issue Type: Bug
  Components: C#
Reporter: Eric Erhardt


The C# implementation doesn't support reading and writing Date32 and Date64 
arrays. We need to add support and some tests.

It looks like it is only a couple of lines to get this enabled. See 
[https://github.com/apache/arrow/pull/5413].





[jira] [Resolved] (ARROW-6716) [CI] [Rust] New 1.40.0 nightly causing builds to fail

2019-09-27 Thread Paddy Horan (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paddy Horan resolved ARROW-6716.

Fix Version/s: (was: 1.0.0)
   0.15.0
   Resolution: Fixed

Issue resolved by pull request 5519
[https://github.com/apache/arrow/pull/5519]

> [CI] [Rust] New 1.40.0 nightly causing builds to fail
> -
>
> Key: ARROW-6716
> URL: https://issues.apache.org/jira/browse/ARROW-6716
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: CI, Rust
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> So much for pinning the nightly version ... that doesn't work when there is a 
> new major version of a nightly apparently.
> Travis is now using:
> {code:java}
> rustc 1.40.0-nightly (37538aa13 2019-09-25) {code}
> Despite rust-toolchain containing:
> {code:java}
> nightly-2019-07-30 {code}
>  
>  





[jira] [Commented] (ARROW-6726) [CI] Figure out how to run reliably the fuzzit task

2019-09-27 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939523#comment-16939523
 ] 

Krisztian Szucs commented on ARROW-6726:


[~yevgenyp] what arguments are required to push the binaries to fuzzit.dev? 

> [CI] Figure out how to run reliably the fuzzit task
> ---
>
> Key: ARROW-6726
> URL: https://issues.apache.org/jira/browse/ARROW-6726
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Priority: Major
>
> It was disabled in https://github.com/apache/arrow/pull/5528





[jira] [Updated] (ARROW-6614) [C++][Dataset] Implement FileSystemDataSourceDiscovery

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6614:
--
Labels: dataset pull-request-available  (was: dataset)

> [C++][Dataset] Implement FileSystemDataSourceDiscovery
> --
>
> Key: ARROW-6614
> URL: https://issues.apache.org/jira/browse/ARROW-6614
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Francois Saint-Jacques
>Priority: Major
>  Labels: dataset, pull-request-available
>
> DataSourceDiscovery is what allows inferring a schema and constructing a 
> DataSource with a PartitionScheme.





[jira] [Commented] (ARROW-6726) [CI] Figure out how to run reliably the fuzzit task

2019-09-27 Thread Yevgeny Pats (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939492#comment-16939492
 ] 

Yevgeny Pats commented on ARROW-6726:
-

[~kszucs] how can I help? I mean what are the current questions/issues that I 
can address?

> [CI] Figure out how to run reliably the fuzzit task
> ---
>
> Key: ARROW-6726
> URL: https://issues.apache.org/jira/browse/ARROW-6726
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Priority: Major
>
> It was disabled in https://github.com/apache/arrow/pull/5528





[jira] [Created] (ARROW-6727) [Crossbow] Improve the GitHub asset uploading prevent GitHub timeouts

2019-09-27 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-6727:
--

 Summary: [Crossbow] Improve the GitHub asset uploading prevent 
GitHub timeouts
 Key: ARROW-6727
 URL: https://issues.apache.org/jira/browse/ARROW-6727
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Reporter: Krisztian Szucs


For large assets, artefact uploading occasionally fails: 
https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=1453

Either increase the timeout in github3.py or, if that is not possible, retry 
the upload.





[jira] [Created] (ARROW-6726) [CI] Figure out how to run reliably the fuzzit task

2019-09-27 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-6726:
--

 Summary: [CI] Figure out how to run reliably the fuzzit task
 Key: ARROW-6726
 URL: https://issues.apache.org/jira/browse/ARROW-6726
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Reporter: Krisztian Szucs


It was disabled in https://github.com/apache/arrow/pull/5528





[jira] [Updated] (ARROW-6725) [CI] Disable 3rdparty fuzzit nightly builds

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6725:
--
Labels: pull-request-available  (was: )

> [CI] Disable 3rdparty fuzzit nightly builds
> ---
>
> Key: ARROW-6725
> URL: https://issues.apache.org/jira/browse/ARROW-6725
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
>
> The docker-cpp-fuzzit docker-compose task currently fails, probably because 
> it is missing parameters such as version, git object id, and credentials. 
> Disable it until we have a solid solution for running it regularly.





[jira] [Created] (ARROW-6725) [CI] Disable 3rdparty fuzzit nightly builds

2019-09-27 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-6725:
--

 Summary: [CI] Disable 3rdparty fuzzit nightly builds
 Key: ARROW-6725
 URL: https://issues.apache.org/jira/browse/ARROW-6725
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Reporter: Krisztian Szucs


The docker-cpp-fuzzit docker-compose task currently fails, probably because 
it is missing parameters such as version, git object id, and credentials. 
Disable it until we have a solid solution for running it regularly.





[jira] [Updated] (ARROW-6725) [CI] Disable 3rdparty fuzzit nightly builds

2019-09-27 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-6725:
---
Issue Type: Task  (was: Improvement)

> [CI] Disable 3rdparty fuzzit nightly builds
> ---
>
> Key: ARROW-6725
> URL: https://issues.apache.org/jira/browse/ARROW-6725
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Priority: Major
>
> The docker-cpp-fuzzit docker-compose task currently fails, probably because 
> it is missing parameters such as version, git object id, and credentials. 
> Disable it until we have a solid solution for running it regularly.





[jira] [Commented] (ARROW-6625) [Python] Allow concat_tables to null or default fill missing columns

2019-09-27 Thread Wes McKinney (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939433#comment-16939433
 ] 

Wes McKinney commented on ARROW-6625:
-

Yes, having null fields automatically promoted to a non-null type seems 
reasonable to me. I am not working on this, so please be our guest.

> [Python] Allow concat_tables to null or default fill missing columns
> 
>
> Key: ARROW-6625
> URL: https://issues.apache.org/jira/browse/ARROW-6625
> Project: Apache Arrow
>  Issue Type: Wish
>  Components: Python
>Reporter: Daniel Nugent
>Priority: Minor
> Fix For: 1.0.0
>
>
> The concat_tables function currently requires schemas to be identical across 
> all tables to be concatenated. However, tables occasionally agree on the 
> types of the columns they share, while a column is absent from some of them.
> In this case, allowing for null filling (or default filling) would be ideal.
> I imagine this feature would be an optional parameter on the concat_tables 
> function. Presumably the argument could be either a boolean in the case of 
> blanket null filling, or a mapping type for default filling. If a user wanted 
> to default fill some columns, but null fill others, they could use a None as 
> the value (defaultdict would make it simple to provide a blanket null fill if 
> only a few default value columns were desired).
> If a mapping wasn't present, the function should probably raise an error.
> The default behavior would remain the current one, so the default value of 
> the parameter should be False or None.





[jira] [Commented] (ARROW-6713) [Python] Getting "ArrowIOError: Corrupted file, smaller than file footer" when reading large number of parquet files to ParquetDataset()

2019-09-27 Thread Wes McKinney (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939426#comment-16939426
 ] 

Wes McKinney commented on ARROW-6713:
-

Can you identify the file it's failing on? Is that file readable on its own? 

> [Python] Getting "ArrowIOError: Corrupted file, smaller than file footer" 
> when reading large number of parquet files to ParquetDataset()
> 
>
> Key: ARROW-6713
> URL: https://issues.apache.org/jira/browse/ARROW-6713
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Harini Kannan
>Priority: Major
>  Labels: parquet
> Attachments: Screen Shot 2019-09-26 at 2.30.49 PM.png
>
>
> When trying to read a large number of parquet files (> 600) into 
> ParquetDataset(), I get the error: 
> ArrowIOError: Corrupted file, smaller than file footer.
>  
> This could be related to this issue: 
> https://issues.apache.org/jira/browse/ARROW-3424
> Note:
> - This works fine for a small number of parquet files (< 245 to be exact, 
> not sure if this helps).
>  
>  





[jira] [Created] (ARROW-6724) [C++] Add simpler static ctor for BufferOutputStream than the current Create function

2019-09-27 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6724:
---

 Summary: [C++] Add simpler static ctor for BufferOutputStream than 
the current Create function
 Key: ARROW-6724
 URL: https://issues.apache.org/jira/browse/ARROW-6724
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney


Not a major rough edge, but the current {{Create}} function strikes me as a bit 
awkward, since a size and memory pool must be explicitly passed.





[jira] [Updated] (ARROW-6723) [Java] Reduce the range of synchronized block when releasing an ArrowBuf

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6723:
--
Labels: pull-request-available  (was: )

> [Java] Reduce the range of synchronized block when releasing an ArrowBuf
> 
>
> Key: ARROW-6723
> URL: https://issues.apache.org/jira/browse/ARROW-6723
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
>
> When releasing an ArrowBuf, we will run the following piece of code:
>   private int decrement(int decrement) {
>     allocator.assertOpen();
>     final int outcome;
>     synchronized (allocationManager) {
>       outcome = bufRefCnt.addAndGet(-decrement);
>       if (outcome == 0) {
>         lDestructionTime = System.nanoTime();
>         allocationManager.release(this);
>       }
>     }
>     return outcome;
>   }
> It can be seen that we must acquire the allocation manager lock regardless of 
> whether the buffer needs to be released. In addition, the refcount is only 
> decremented after the lock is acquired. This leads to unnecessary resource 
> contention, and may degrade performance. 
> We propose to change the code like this:
>   private int decrement(int decrement) {
>     allocator.assertOpen();
>     final int outcome;
>     outcome = bufRefCnt.addAndGet(-decrement);
>     if (outcome == 0) {
>       lDestructionTime = System.nanoTime();
>       synchronized (allocationManager) {
>         allocationManager.release(this);
>       }
>     }
>     return outcome;
>   }
> Note that this change can be dangerous, as it lies in the core of our code 
> base, so we should be careful with it. On the other hand, it may have 
> non-trivial performance implications. As far as I know, when a distributed 
> task is getting closed, a large number of ArrowBuf instances will be closed 
> simultaneously. If we reduce the range of the synchronized block, we can 
> significantly improve the performance. 
> What do you think?





[jira] [Created] (ARROW-6723) [Java] Reduce the range of synchronized block when releasing an ArrowBuf

2019-09-27 Thread Liya Fan (Jira)
Liya Fan created ARROW-6723:
---

 Summary: [Java] Reduce the range of synchronized block when 
releasing an ArrowBuf
 Key: ARROW-6723
 URL: https://issues.apache.org/jira/browse/ARROW-6723
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Java
Reporter: Liya Fan
Assignee: Liya Fan


When releasing an ArrowBuf, we will run the following piece of code:

  private int decrement(int decrement) {
    allocator.assertOpen();
    final int outcome;
    synchronized (allocationManager) {
      outcome = bufRefCnt.addAndGet(-decrement);
      if (outcome == 0) {
        lDestructionTime = System.nanoTime();
        allocationManager.release(this);
      }
    }
    return outcome;
  }

It can be seen that we must acquire the allocation manager lock regardless of 
whether the buffer needs to be released. In addition, the refcount is only 
decremented after the lock is acquired. This leads to unnecessary resource 
contention, and may degrade performance. 

We propose to change the code like this:

  private int decrement(int decrement) {
    allocator.assertOpen();
    final int outcome;
    outcome = bufRefCnt.addAndGet(-decrement);
    if (outcome == 0) {
      lDestructionTime = System.nanoTime();
      synchronized (allocationManager) {
        allocationManager.release(this);
      }
    }
    return outcome;
  }

Note that this change can be dangerous, as it lies in the core of our code 
base, so we should be careful with it. On the other hand, it may have 
non-trivial performance implications. As far as I know, when a distributed task 
is getting closed, a large number of ArrowBuf instances will be closed 
simultaneously. If we reduce the range of the synchronized block, we can 
significantly improve the performance. 

What do you think?
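
For illustration only, here is a minimal, self-contained sketch of the proposed 
pattern. RefCountSketch, releaseCount, and the plain Object used as a lock are 
hypothetical stand-ins, not Arrow's actual classes. The counter is decremented 
atomically outside the lock, and the lock is taken only by the one thread that 
observes zero. The sketch does not model any other code path that might rely on 
the lock being held during the decrement, which is presumably where the risk 
mentioned above would lie.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a reference-counted buffer; not Arrow's code. */
final class RefCountSketch {
  // Stand-in for the allocation manager, used here only as a lock object.
  private final Object allocationManager = new Object();
  private final AtomicInteger bufRefCnt;
  // Number of times the release path ran; guarded by allocationManager.
  private int releases = 0;

  RefCountSketch(int initialRefs) {
    bufRefCnt = new AtomicInteger(initialRefs);
  }

  /** Drops references; locks only when the count reaches zero. */
  int decrement(int decrement) {
    final int outcome = bufRefCnt.addAndGet(-decrement);
    if (outcome == 0) {
      synchronized (allocationManager) {
        releases++; // stands in for allocationManager.release(this)
      }
    }
    return outcome;
  }

  int releaseCount() {
    synchronized (allocationManager) {
      return releases;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    final int threads = 8;
    final RefCountSketch buf = new RefCountSketch(threads);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> buf.decrement(1));
      workers[i].start();
    }
    for (Thread t : workers) {
      t.join();
    }
    System.out.println("releases = " + buf.releaseCount());
  }
}
```

With eight threads each dropping one of eight references, the release path 
should run exactly once, because addAndGet returns 0 to only one caller.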





[jira] [Updated] (ARROW-6722) [Java] Provide a uniform way to get vector name

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6722:
--
Labels: pull-request-available  (was: )

> [Java] Provide a uniform way to get vector name
> ---
>
> Key: ARROW-6722
> URL: https://issues.apache.org/jira/browse/ARROW-6722
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, the getName method is defined in BaseValueVector, an abstract 
> class. However, some vectors, such as StructVector, UnionVector, and 
> ZeroVector, do not extend BaseValueVector.
> In this issue, we move the method to the ValueVector interface, the base 
> interface for all vectors.
> This makes it easier to get a vector's name without checking its type.





[jira] [Created] (ARROW-6722) [Java] Provide a uniform way to get vector name

2019-09-27 Thread Liya Fan (Jira)
Liya Fan created ARROW-6722:
---

 Summary: [Java] Provide a uniform way to get vector name
 Key: ARROW-6722
 URL: https://issues.apache.org/jira/browse/ARROW-6722
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Java
Reporter: Liya Fan
Assignee: Liya Fan


Currently, the getName method is defined in BaseValueVector, an abstract 
class. However, some vectors, such as StructVector, UnionVector, and 
ZeroVector, do not extend BaseValueVector.
In this issue, we move the method to the ValueVector interface, the base 
interface for all vectors.
This makes it easier to get a vector's name without checking its type.
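
As a rough illustration of the idea, the sketch below declares getName on the 
base interface so that callers need no type checks. The nested types are 
hypothetical miniatures that only borrow the names used above; they are not 
Arrow's real classes, and describe is an invented helper.

```java
/** Hypothetical miniature of the vector hierarchy; not Arrow's real classes. */
final class VectorNameSketch {
  /** Base interface for all vectors; getName is declared here. */
  interface ValueVector {
    String getName();
  }

  /** Abstract base that most, but not all, vectors extend. */
  abstract static class BaseValueVector implements ValueVector {
    private final String name;

    BaseValueVector(String name) {
      this.name = name;
    }

    @Override
    public String getName() {
      return name;
    }
  }

  static final class IntVector extends BaseValueVector {
    IntVector(String name) {
      super(name);
    }
  }

  /** Does not extend BaseValueVector, yet still exposes its name. */
  static final class StructVector implements ValueVector {
    private final String name;

    StructVector(String name) {
      this.name = name;
    }

    @Override
    public String getName() {
      return name;
    }
  }

  /** Works for any vector without instanceof checks or casts. */
  static String describe(ValueVector v) {
    return "vector:" + v.getName();
  }

  public static void main(String[] args) {
    System.out.println(describe(new IntVector("id")));
    System.out.println(describe(new StructVector("address")));
  }
}
```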





[jira] [Updated] (ARROW-6721) [JAVA] Avro adapter benchmark only runs once in JMH

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6721:
--
Labels: pull-request-available  (was: )

> [JAVA] Avro adapter benchmark only runs once in JMH
> ---
>
> Key: ARROW-6721
> URL: https://issues.apache.org/jira/browse/ARROW-6721
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java
>Reporter: Ji Liu
>Assignee: Ji Liu
>Priority: Minor
>  Labels: pull-request-available
>
> The current {{AvroAdapterBenchmark}} actually runs only once during JMH 
> evaluation, because the decoder is consumed by the first invocation and 
> follow-up invocations return immediately.
> To solve this, we use {{BinaryDecoder}} explicitly in the benchmark and reset 
> its inner stream each time the test method is invoked.





[jira] [Updated] (ARROW-6721) [JAVA] Avro adapter benchmark only runs once in JMH

2019-09-27 Thread Ji Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ji Liu updated ARROW-6721:
--
Parent: ARROW-5845
Issue Type: Sub-task  (was: Bug)

> [JAVA] Avro adapter benchmark only runs once in JMH
> ---
>
> Key: ARROW-6721
> URL: https://issues.apache.org/jira/browse/ARROW-6721
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java
>Reporter: Ji Liu
>Assignee: Ji Liu
>Priority: Minor
>
> The current {{AvroAdapterBenchmark}} actually runs only once during JMH 
> evaluation, because the decoder is consumed by the first invocation and 
> follow-up invocations return immediately.
> To solve this, we use {{BinaryDecoder}} explicitly in the benchmark and reset 
> its inner stream each time the test method is invoked.





[jira] [Created] (ARROW-6721) [JAVA] Avro adapter benchmark only runs once in JMH

2019-09-27 Thread Ji Liu (Jira)
Ji Liu created ARROW-6721:
-

 Summary: [JAVA] Avro adapter benchmark only runs once in JMH
 Key: ARROW-6721
 URL: https://issues.apache.org/jira/browse/ARROW-6721
 Project: Apache Arrow
  Issue Type: Bug
  Components: Java
Reporter: Ji Liu
Assignee: Ji Liu


The current {{AvroAdapterBenchmark}} actually runs only once during JMH 
evaluation, because the decoder is consumed by the first invocation and 
follow-up invocations return immediately.

To solve this, we use {{BinaryDecoder}} explicitly in the benchmark and reset 
its inner stream each time the test method is invoked.
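
The underlying pitfall can be shown without JMH or Avro. The sketch below is 
only an analogy under that assumption: a plain ByteArrayInputStream plays the 
role of the decoder's inner stream, and ConsumedInputSketch and drain are 
hypothetical names. Once the stream is exhausted, repeated calls do no work 
until it is reset.

```java
import java.io.ByteArrayInputStream;

/** Hypothetical analogy for a benchmark whose input is consumed once. */
final class ConsumedInputSketch {
  /** Reads until end of stream and returns how many bytes were consumed. */
  static int drain(ByteArrayInputStream in) {
    int count = 0;
    while (in.read() != -1) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    ByteArrayInputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});
    System.out.println(drain(in)); // first call does the real work
    System.out.println(drain(in)); // stream is exhausted, so nothing is read
    in.reset();                    // rewind to the start of the buffer
    System.out.println(drain(in)); // the work is repeated after the reset
  }
}
```

A benchmark method that measures the second kind of call is timing an 
essentially empty loop, which is why resetting the input at the start of each 
invocation matters.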





[jira] [Commented] (ARROW-6716) [CI] [Rust] New 1.40.0 nightly causing builds to fail

2019-09-27 Thread Adam Lippai (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939221#comment-16939221
 ] 

Adam Lippai commented on ARROW-6716:


[~andygrove] if this is based on 
[https://github.com/apache/arrow/pull/5502/files], I forgot the nightly setting 
there. With the pinned one, RLS didn't work for me on Windows, so I bumped it 
locally.

> [CI] [Rust] New 1.40.0 nightly causing builds to fail
> -
>
> Key: ARROW-6716
> URL: https://issues.apache.org/jira/browse/ARROW-6716
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: CI, Rust
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> So much for pinning the nightly version ... that doesn't work when there is a 
> new major version of a nightly apparently.
> Travis is now using:
> {code:java}
> rustc 1.40.0-nightly (37538aa13 2019-09-25) {code}
> Despite rust-toolchain containing:
> {code:java}
> nightly-2019-07-30 {code}
>  
>  





[jira] [Updated] (ARROW-4219) [Rust] [Parquet] Implement ArrowReader

2019-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-4219:
--
Labels: pull-request-available  (was: )

> [Rust] [Parquet] Implement ArrowReader
> --
>
> Key: ARROW-4219
> URL: https://issues.apache.org/jira/browse/ARROW-4219
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Rust
>Reporter: Renjie Liu
>Assignee: Renjie Liu
>Priority: Major
>  Labels: pull-request-available
>
> ArrowReader reads Parquet into Arrow. In this ticket, our goal is to 
> implement get_schema and read row groups into a record batch iterator.


