[jira] [Created] (ARROW-9502) [Python][C++] Date64 converted to Date32 on parquet
Jorge created ARROW-9502:

Summary: [Python][C++] Date64 converted to Date32 on parquet
Key: ARROW-9502
URL: https://issues.apache.org/jira/browse/ARROW-9502
Project: Apache Arrow
Issue Type: Bug
Components: C++, Python
Reporter: Jorge

Executing the example below,

{code:python}
import datetime

import pyarrow as pa
import pyarrow.parquet

data = [
    datetime.datetime(2000, 1, 1, 12, 34, 56, 123456),
    datetime.datetime(2000, 1, 1),
]
data32 = pa.array(data, type='date32')
data64 = pa.array(data, type='date64')

table = pyarrow.Table.from_arrays([data32, data64], names=['a', 'b'])
pyarrow.parquet.write_table(table, 'a.parquet')

print(table)
print()
print(pyarrow.parquet.read_table('a.parquet'))
{code}

yields

{code}
pyarrow.Table
a: date32[day]
b: date64[ms]

pyarrow.Table
a: date32[day]
b: date32[day]  <--- IMO it should be date64[ms]
{code}

indicating that pyarrow converted its date64[ms] schema to date32[day]. I used the Rust crate to print the parquet file's metadata, and the value is indeed stored as i32, which suggests that this happens in the writer, not the reader.

IMO this does not have any practical implication, since both types represent dates at day resolution, but it still constitutes an error, as the roundtrip serialization does not preserve the schema.

A broader question I have is why date64 exists in the first place: I can't see any reason to store a *date* in milliseconds since EPOCH.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
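[Editor's note] For reference, the two Arrow date types encode the same calendar date at day resolution but with different units: date32 stores days since the UNIX epoch, date64 stores milliseconds since the epoch (a multiple of the milliseconds per day). A minimal pure-Python sketch of that relationship; the helper names are illustrative, not pyarrow API:

```python
import datetime

EPOCH = datetime.date(1970, 1, 1)
MS_PER_DAY = 86_400_000

def date32_value(d: datetime.date) -> int:
    # days since the UNIX epoch, as stored by date32[day]
    return (d - EPOCH).days

def date64_value(d: datetime.date) -> int:
    # milliseconds since the UNIX epoch, as stored by date64[ms]
    return date32_value(d) * MS_PER_DAY

d = datetime.date(2000, 1, 1)
assert date32_value(d) == 10957
assert date64_value(d) == 10957 * MS_PER_DAY
```

This also shows why the lossy direction is the one in the report: any in-range date64 divides evenly by MS_PER_DAY, so converting to date32 silently changes only the declared type, not the dates.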
[jira] [Created] (ARROW-9501) [C++][Gandiva] Add logic in timestampdiff() when end date is last day of a month
Sagnik Chakraborty created ARROW-9501:

Summary: [C++][Gandiva] Add logic in timestampdiff() when end date is last day of a month
Key: ARROW-9501
URL: https://issues.apache.org/jira/browse/ARROW-9501
Project: Apache Arrow
Issue Type: Task
Reporter: Sagnik Chakraborty

{{timestampdiff}}(*month*, _startDate_, _endDate_) returns a wrong result in Gandiva when _endDate_'s day-of-month is less than _startDate_'s and _endDate_ falls on the last day of its month. An additional month is counted as having passed when the end day is greater than or equal to the start day, but this does not hold for end dates that are the last day of a month. Case in point: if _startDate_ = *2020-01-31* and _endDate_ = *2020-02-29*, {{timestampdiff}}() previously returned *0*, but the correct result is *1*.
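[Editor's note] The rule described above can be sketched in a few lines of Python; this is a hypothetical reimplementation for illustration, not Gandiva's actual C++ code:

```python
import calendar
import datetime

def months_between(start: datetime.date, end: datetime.date) -> int:
    """Whole months from start to end (assumes start <= end).

    A month counts as complete when the end day reaches the start day,
    or when the end day is the last day of its month (the fix proposed
    in ARROW-9501)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    last_day_of_end_month = calendar.monthrange(end.year, end.month)[1]
    if end.day < start.day and end.day != last_day_of_end_month:
        months -= 1  # the final month has not fully elapsed
    return months

# the case from the report: Jan 31 -> Feb 29 is one full month
assert months_between(datetime.date(2020, 1, 31), datetime.date(2020, 2, 29)) == 1
# one day short of the anniversary, no full month has elapsed
assert months_between(datetime.date(2020, 1, 15), datetime.date(2020, 2, 14)) == 0
```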
[jira] [Created] (ARROW-9500) [C++] Fix segfault with std::to_string in -O3 builds on gcc 7.5.0
Wes McKinney created ARROW-9500:

Summary: [C++] Fix segfault with std::to_string in -O3 builds on gcc 7.5.0
Key: ARROW-9500
URL: https://issues.apache.org/jira/browse/ARROW-9500
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Wes McKinney
Assignee: Wes McKinney
Fix For: 1.0.0

There seems to be a gcc bug related to {{std::to_string}} that only appears in {{-O3}} builds. It can be seen in something innocuous like

{code}
return Status::Invalid("Float value ", std::to_string(val),
                       " was truncated converting to ", *output.type());
{code}

where {{val}} is NaN. I haven't found a canonical reference, but using something other than to_string for the formatting (here just letting {{std::ostringstream}} take care of it) makes the problem go away. I wasn't able to reproduce the issue with gcc-8.
[jira] [Created] (ARROW-9499) [C++] AdaptiveIntBuilder::null_count does not return the null count
Kenta Murata created ARROW-9499:

Summary: [C++] AdaptiveIntBuilder::null_count does not return the null count
Key: ARROW-9499
URL: https://issues.apache.org/jira/browse/ARROW-9499
Project: Apache Arrow
Issue Type: Bug
Reporter: Kenta Murata
Assignee: Kenta Murata
[jira] [Created] (ARROW-9498) [C++][Parquet] Consider revamping RleDecoder based on "upstream" changes in Apache Impala
Wes McKinney created ARROW-9498:

Summary: [C++][Parquet] Consider revamping RleDecoder based on "upstream" changes in Apache Impala
Key: ARROW-9498
URL: https://issues.apache.org/jira/browse/ARROW-9498
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Wes McKinney

Since the initial code import in 2016, Impala has made some improvements to RleDecoder that we might examine to see whether they are beneficial for us. See https://github.com/apache/impala/blob/master/be/src/util/rle-encoding.h and its history.
[GitHub] [arrow-testing] wesm merged pull request #40: ARROW-9497: [C++][Parquet] Add oss-fuzz test case
wesm merged pull request #40: URL: https://github.com/apache/arrow-testing/pull/40 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (ARROW-9497) [C++][Parquet] Fix failure caused by malformed repetition/definition levels
Wes McKinney created ARROW-9497:

Summary: [C++][Parquet] Fix failure caused by malformed repetition/definition levels
Key: ARROW-9497
URL: https://issues.apache.org/jira/browse/ARROW-9497
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Wes McKinney
Assignee: Wes McKinney

Fix a case discovered by OSS-Fuzz.
[GitHub] [arrow-testing] wesm opened a new pull request #40: ARROW-9497: [C++][Parquet] Add oss-fuzz test case
wesm opened a new pull request #40:
URL: https://github.com/apache/arrow-testing/pull/40
[jira] [Created] (ARROW-9496) toArray() called on filtered Table returns all rows
Peter Murphy created ARROW-9496:

Summary: toArray() called on filtered Table returns all rows
Key: ARROW-9496
URL: https://issues.apache.org/jira/browse/ARROW-9496
Project: Apache Arrow
Issue Type: Bug
Components: JavaScript
Environment: OSX 10.15.2; behavior seen in RunKit and the Node.js Jest test runner
Reporter: Peter Murphy

I am experimenting with building a library on top of Apache Arrow's JavaScript implementation, but ran into this. Example: https://runkit.com/pjm17971/pond-arrow

{code:java}
const filtered = table.filter(predicate.col("pressure").lt(28.5))
filtered.count() // 2 (correct)
{code}

However:

{code:java}
const result = filtered.toArray().map(row => row.toJSON()) // 4 rows (??)
{code}

Is this expected behavior?
[jira] [Created] (ARROW-9495) [C++] Equality assertions don't handle Inf / -Inf properly
Antoine Pitrou created ARROW-9495:

Summary: [C++] Equality assertions don't handle Inf / -Inf properly
Key: ARROW-9495
URL: https://issues.apache.org/jira/browse/ARROW-9495
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Antoine Pitrou
Fix For: 2.0.0

I got this error when working on a PR which added unit tests:

{code}
../src/arrow/testing/gtest_util.cc:101: Failure
Failed
Expected: [ 2.5, inf, -inf ]
Actual: [ 2.5, inf, -inf ]
{code}
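[Editor's note] A plausible mechanism for "equal" values failing an assertion while printing identically (my guess, not confirmed against the Arrow source): a tolerance-based approximate-equality check that subtracts the two values breaks on infinities, because the difference of equal infinities is NaN. Illustrated in plain Python:

```python
import math

inf = float("inf")

# the difference of two equal infinities is NaN, not zero
assert math.isnan(inf - inf)

# so a tolerance check built on subtraction rejects them...
assert not (abs(inf - inf) <= 1e-9)

# ...even though direct comparison says they are equal
assert inf == inf
assert -inf == -inf
```

A fix along these lines would special-case non-finite values (compare with == before falling back to the tolerance check).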
[jira] [Created] (ARROW-9494) [Rust] master fails due to use of "fXX::NAN"
Paddy Horan created ARROW-9494:

Summary: [Rust] master fails due to use of "fXX::NAN"
Key: ARROW-9494
URL: https://issues.apache.org/jira/browse/ARROW-9494
Project: Apache Arrow
Issue Type: Bug
Components: Rust
Reporter: Paddy Horan
Assignee: Paddy Horan

I'm getting an error that no associated type exists. Changing to "std::fXX::NAN" fixes the issue.
[jira] [Created] (ARROW-9493) [Python][Dataset] Dictionary encode string partition columns by default
Ben Kietzman created ARROW-9493:

Summary: [Python][Dataset] Dictionary encode string partition columns by default
Key: ARROW-9493
URL: https://issues.apache.org/jira/browse/ARROW-9493
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Affects Versions: 0.17.1
Reporter: Ben Kietzman
Assignee: Ben Kietzman
Fix For: 1.0.0

ARROW-9139 switched the default of use_legacy_dataset from True to False, but left dictionary encoding of string partition columns off by default.
[jira] [Created] (ARROW-9492) [C++] Ensure private functions are static or in an anonymous namespace
Ben Kietzman created ARROW-9492:

Summary: [C++] Ensure private functions are static or in an anonymous namespace
Key: ARROW-9492
URL: https://issues.apache.org/jira/browse/ARROW-9492
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Affects Versions: 0.17.1
Reporter: Ben Kietzman

There are a number of functions which are not intended to be exported (for example, they are defined in a {{.cc}} file) but are not marked {{static inline}} or declared in an anonymous namespace. This can lead to surprising link errors. Existing private functions should be marked appropriately, and ideally a linter could be added to ensure new ones are not introduced without appropriate markings.
[jira] [Created] (ARROW-9491) [Rust] "simd" feature is not tested in CI
Paddy Horan created ARROW-9491:

Summary: [Rust] "simd" feature is not tested in CI
Key: ARROW-9491
URL: https://issues.apache.org/jira/browse/ARROW-9491
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: Paddy Horan
Assignee: Paddy Horan
[jira] [Created] (ARROW-9490) pyarrow array creation for specific set of numpy scalars fails
Ramakrishna Prabhu created ARROW-9490:

Summary: pyarrow array creation for specific set of numpy scalars fails
Key: ARROW-9490
URL: https://issues.apache.org/jira/browse/ARROW-9490
Project: Apache Arrow
Issue Type: Bug
Environment: conda
Reporter: Ramakrishna Prabhu

When creating an array from a list of numpy scalars, pyarrow fails with the message 'Integer scalar type not recognized'; details below:

{code:python}
>>> import pyarrow as pa
>>> import numpy as np
>>> pa.array([np.int32(4), np.float64(1.5), np.float32(1.290994), np.int8(0)])
Traceback (most recent call last):
  File "", line 1, in
  File "pyarrow/array.pxi", line 269, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 38, in pyarrow.lib._sequence_to_array
  File "pyarrow/error.pxi", line 85, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Integer scalar type not recognized
{code}
[jira] [Created] (ARROW-9489) [C++] Add fill_null kernel implementation for (array[string], scalar[string])
Uwe Korn created ARROW-9489:

Summary: [C++] Add fill_null kernel implementation for (array[string], scalar[string])
Key: ARROW-9489
URL: https://issues.apache.org/jira/browse/ARROW-9489
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Uwe Korn
Fix For: 2.0.0
[jira] [Created] (ARROW-9488) [Release] Use the new changelog generation when updating the website
Krisztian Szucs created ARROW-9488:

Summary: [Release] Use the new changelog generation when updating the website
Key: ARROW-9488
URL: https://issues.apache.org/jira/browse/ARROW-9488
Project: Apache Arrow
Issue Type: Improvement
Components: Developer Tools
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
Fix For: 2.0.0

The following command updates CHANGELOG.md, but the same content should also be added as release notes in the post-03-website.sh script. See the TODO note at https://github.com/apache/arrow/pull/7162/files#diff-58442bc78393d2113825def6aad913a0R143

{code}
archery release changelog add 1.0.0
{code}
[jira] [Created] (ARROW-9487) [Developer] Cover the archery release utilities with unittests
Krisztian Szucs created ARROW-9487:

Summary: [Developer] Cover the archery release utilities with unittests
Key: ARROW-9487
URL: https://issues.apache.org/jira/browse/ARROW-9487
Project: Apache Arrow
Issue Type: Improvement
Components: Archery, Developer Tools
Reporter: Krisztian Szucs
Fix For: 2.0.0

Deferring the unit tests of https://github.com/apache/arrow/pull/7162 to this JIRA.
[jira] [Created] (ARROW-9486) [C++][Dataset] Support implicit casting InExpression::set_ to dict
Ben Kietzman created ARROW-9486:

Summary: [C++][Dataset] Support implicit casting InExpression::set_ to dict
Key: ARROW-9486
URL: https://issues.apache.org/jira/browse/ARROW-9486
Project: Apache Arrow
Issue Type: Bug
Components: C++
Affects Versions: 0.17.1
Reporter: Ben Kietzman
Assignee: Ben Kietzman
Fix For: 1.0.0

{{test_filters_inclusive_set}} is still failing due to lack of support for casting to dictionary. Add fallbacks to DictionaryEncode if conversion to a dictionary array is required.
[jira] [Created] (ARROW-9485) [R] Better shared library stripping
Neal Richardson created ARROW-9485:

Summary: [R] Better shared library stripping
Key: ARROW-9485
URL: https://issues.apache.org/jira/browse/ARROW-9485
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson
Fix For: 1.0.0
[jira] [Created] (ARROW-9484) [Docs] Update is* functions to be is_* in the compute docs
Neal Richardson created ARROW-9484:

Summary: [Docs] Update is* functions to be is_* in the compute docs
Key: ARROW-9484
URL: https://issues.apache.org/jira/browse/ARROW-9484
Project: Apache Arrow
Issue Type: Improvement
Components: Documentation
Reporter: Neal Richardson
Assignee: Neal Richardson
Fix For: 1.0.0

Follow-up to the follow-up, ARROW-9390.
[jira] [Created] (ARROW-9483) [C++] Reorganize testing headers
Antoine Pitrou created ARROW-9483:

Summary: [C++] Reorganize testing headers
Key: ARROW-9483
URL: https://issues.apache.org/jira/browse/ARROW-9483
Project: Apache Arrow
Issue Type: Wish
Components: C++
Reporter: Antoine Pitrou
Fix For: 2.0.0

Currently, {{gtest_util.h}} contains a hodge-podge of different things. It would be nice if things were separated a bit more, for example an {{asserts.h}} file for all home-grown assertion functions and macros.
[jira] [Created] (ARROW-9482) [Rust] [DataFusion] Implement pretty print for physical query plan
Andy Grove created ARROW-9482:

Summary: [Rust] [DataFusion] Implement pretty print for physical query plan
Key: ARROW-9482
URL: https://issues.apache.org/jira/browse/ARROW-9482
Project: Apache Arrow
Issue Type: Sub-task
Components: Rust, Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove

Implement pretty print for the physical query plan, similar to what we have for the logical plan.
[jira] [Created] (ARROW-9481) [Rust] [DataFusion] Create physical plan enum to wrap execution plan
Andy Grove created ARROW-9481:

Summary: [Rust] [DataFusion] Create physical plan enum to wrap execution plan
Key: ARROW-9481
URL: https://issues.apache.org/jira/browse/ARROW-9481
Project: Apache Arrow
Issue Type: Sub-task
Components: Rust, Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove

By wrapping the execution plan structs in an enum, we make it possible to build a tree representing the physical plan, just like we do with the logical plan. This makes it easy to print physical plans and also to apply transformations to them.

{code}
pub enum PhysicalPlan {
    /// Projection.
    Projection(Arc<ProjectionExec>),
    /// Filter a.k.a. predicate.
    Filter(Arc<FilterExec>),
    /// Hash aggregate.
    HashAggregate(Arc<HashAggregateExec>),
    /// Performs a hash join of two child relations by first shuffling the data using the join keys.
    ShuffledHashJoin(ShuffledHashJoinExec),
    /// Performs a shuffle that will result in the desired partitioning.
    ShuffleExchange(Arc<ShuffleExchangeExec>),
    /// Reads results from a ShuffleExchange.
    ShuffleReader(Arc<ShuffleReaderExec>),
    /// Scans a partitioned data source.
    ParquetScan(Arc<ParquetScanExec>),
    /// Scans an in-memory table.
    InMemoryTableScan(Arc<InMemoryTableScanExec>),
}
{code}
[jira] [Created] (ARROW-9480) [Rust] [DataFusion] All DataFusion execution plan traits should require Send + Sync
Andy Grove created ARROW-9480:

Summary: [Rust] [DataFusion] All DataFusion execution plan traits should require Send + Sync
Key: ARROW-9480
URL: https://issues.apache.org/jira/browse/ARROW-9480
Project: Apache Arrow
Issue Type: Sub-task
Components: Rust, Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove

All DataFusion execution plan traits should require Send + Sync, to prepare for async support.
[jira] [Created] (ARROW-9479) [JS] Table.from fails for zero-item Lists, FixedSizeLists, Maps. ditto Table.empty
Nicholas Roberts created ARROW-9479:

Summary: [JS] Table.from fails for zero-item Lists, FixedSizeLists, Maps. ditto Table.empty
Key: ARROW-9479
URL: https://issues.apache.org/jira/browse/ARROW-9479
Project: Apache Arrow
Issue Type: Bug
Components: JavaScript
Affects Versions: 0.17.1
Reporter: Nicholas Roberts

Deserializing zero-item tables (as generated by Table.empty or, in this case, pyarrow.Schema.serialize) whose schema contains a List, FixedSizeList or Map fails due to an unconditional

{code:java}
new Data(/* preceding parameters */ buffers, [childData])
{code}

statement: the childData parameter resolves to [undefined] rather than the desired [].
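[Editor's note] The trap described above is easy to reproduce in any language: unconditionally wrapping a possibly-missing child in a literal yields a one-element collection, not an empty one. A Python analogue (the variable names are illustrative only):

```python
child_data = None  # what a zero-item table's child resolves to

# unconditional wrapping, as in the reported code path:
wrong = [child_data]
assert wrong == [None]   # one element, not empty
assert len(wrong) == 1

# guarded wrapping gives the desired empty list:
right = [] if child_data is None else [child_data]
assert right == []
```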
[jira] [Created] (ARROW-9478) [C++] Improve error message on unsupported cast types
Antoine Pitrou created ARROW-9478:

Summary: [C++] Improve error message on unsupported cast types
Key: ARROW-9478
URL: https://issues.apache.org/jira/browse/ARROW-9478
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou

Currently, the error message when trying an unsupported cast looks like this:

{code}
No cast function available to cast to dictionary
{code}

It would be more informative if the source type was also mentioned.
[jira] [Created] (ARROW-9477) [C++] Fix test case TestSchemaMetadata.MetadataVersionForwardCompatibility
Liya Fan created ARROW-9477:

Summary: [C++] Fix test case TestSchemaMetadata.MetadataVersionForwardCompatibility
Key: ARROW-9477
URL: https://issues.apache.org/jira/browse/ARROW-9477
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Liya Fan

Test case TestSchemaMetadata.MetadataVersionForwardCompatibility is failing on the master branch.
[jira] [Created] (ARROW-9476) [C++][Dataset] HivePartitioning discovery with dictionary types fails for multiple fields
Joris Van den Bossche created ARROW-9476:

Summary: [C++][Dataset] HivePartitioning discovery with dictionary types fails for multiple fields
Key: ARROW-9476
URL: https://issues.apache.org/jira/browse/ARROW-9476
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Joris Van den Bossche

Apparently, ARROW-9288 did not fully / correctly fix the issue. With a single string partition field it now works fine, but once you have multiple string fields you get parsing errors. A reproducible example:

{code:python}
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq
import pyarrow.dataset as ds

foo_keys = np.array(['a', 'b', 'c'], dtype=object)
bar_keys = np.array(['d', 'e', 'f'], dtype=object)
N = 30

table = pa.table({
    'foo': foo_keys.repeat(10),
    'bar': np.tile(np.tile(bar_keys, 5), 2),
    'values': np.random.randn(N),
})

base_path = "test_partition_directories3"
pq.write_to_dataset(table, base_path, partition_cols=["bar", "foo"])

# works
ds.dataset(base_path, partitioning="hive")

# fails
part = ds.HivePartitioning.discover(max_partition_dictionary_size=-1)
ds.dataset(base_path, partitioning=part)
{code}

cc [~bkietz]
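[Editor's note] For background, hive-style partitioning encodes field values in directory names, so discovery has to parse every `key=value` segment of each file path; with dictionary encoding enabled, each field additionally accumulates a dictionary of the values seen across all paths. A simplified pure-Python sketch of the parsing step (illustrative only, not the actual C++ implementation):

```python
def parse_hive_segments(path: str) -> dict:
    """Extract key=value directory segments from a partitioned file path."""
    fields = {}
    for segment in path.split("/"):
        if "=" in segment:
            key, _, value = segment.partition("=")
            fields[key] = value
    return fields

# two partition fields, as in the failing example above
path = "test_partition_directories3/bar=d/foo=a/part-0.parquet"
assert parse_hive_segments(path) == {"bar": "d", "foo": "a"}
```

The bug report suggests the failure is specific to combining this multi-field parsing with dictionary-typed partition fields, since plain string discovery over the same paths works.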
[jira] [Created] (ARROW-9475) Clean up usages of BaseAllocator, use BufferAllocator instead
Hongze Zhang created ARROW-9475:

Summary: Clean up usages of BaseAllocator, use BufferAllocator instead
Key: ARROW-9475
URL: https://issues.apache.org/jira/browse/ARROW-9475
Project: Apache Arrow
Issue Type: Improvement
Affects Versions: 0.17.0
Reporter: Hongze Zhang
Assignee: Hongze Zhang

Some classes' methods use BaseAllocator, or internally cast BufferAllocator to BaseAllocator, instead of requiring BufferAllocator directly; see, e.g., the code in AllocationManager and BufferLedger. This can be improved by exposing the necessary methods on BufferAllocator.