[jira] [Commented] (ARROW-6895) [C++][Parquet] parquet::arrow::ColumnReader: ByteArrayDictionaryRecordReader repeats returned values when calling `NextBatch()`

2020-02-18 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039212#comment-17039212 ] Francois Saint-Jacques commented on ARROW-6895: --- Thanks for the followup Adam, would you

[jira] [Assigned] (ARROW-7338) [C++] Improve InMemoryDataSource to support generator instead of static list

2020-02-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7338: - Assignee: Francois Saint-Jacques > [C++] Improve InMemoryDataSource to

[jira] [Commented] (ARROW-7781) [C++][Dataset] Filtering on a non-existent column gives a segfault

2020-02-12 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035897#comment-17035897 ] Francois Saint-Jacques commented on ARROW-7781: --- [~bkietz] was this fixed? >

[jira] [Created] (ARROW-7821) [Gandiva] Add support for literal variables

2020-02-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7821: - Summary: [Gandiva] Add support for literal variables Key: ARROW-7821 URL: https://issues.apache.org/jira/browse/ARROW-7821 Project: Apache Arrow

[jira] [Created] (ARROW-7819) [C++][Gandiva] Implement gandiva-dump-ir tool to output llvm IR to a file

2020-02-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7819: - Summary: [C++][Gandiva] Implement gandiva-dump-ir tool to output llvm IR to a file Key: ARROW-7819 URL: https://issues.apache.org/jira/browse/ARROW-7819

[jira] [Created] (ARROW-7818) [C++][Gandiva] Generate Filter kernels from gandiva code at compile time

2020-02-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7818: - Summary: [C++][Gandiva] Generate Filter kernels from gandiva code at compile time Key: ARROW-7818 URL: https://issues.apache.org/jira/browse/ARROW-7818

[jira] [Created] (ARROW-7820) [C++][Gandiva] Add CMake support for compiling LLVM's IR into a library

2020-02-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7820: - Summary: [C++][Gandiva] Add CMake support for compiling LLVM's IR into a library Key: ARROW-7820 URL: https://issues.apache.org/jira/browse/ARROW-7820

[jira] [Commented] (ARROW-7825) Have arrow::read_parquet respect options(stringsAsFactors = FALSE)

2020-02-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033832#comment-17033832 ] Francois Saint-Jacques commented on ARROW-7825: --- Side note, the Arrow CSV reader has the

[jira] [Updated] (ARROW-7523) [Tools] Relax clang-tidy check

2020-01-08 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7523: -- Summary: [Tools] Relax clang-tidy check (was: [Tools] Ignore

[jira] [Created] (ARROW-7523) [Tools] Ignore modernize-use-trailing-return-type clang-tidy check

2020-01-08 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7523: - Summary: [Tools] Ignore modernize-use-trailing-return-type clang-tidy check Key: ARROW-7523 URL: https://issues.apache.org/jira/browse/ARROW-7523

[jira] [Resolved] (ARROW-7527) [Python] pandas/feather tests failing on pandas master

2020-01-09 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-7527. --- Resolution: Fixed Issue resolved by pull request 6147

[jira] [Commented] (ARROW-7510) [C++] Array::null_count() is not thread-compatible

2020-01-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013578#comment-17013578 ] Francois Saint-Jacques commented on ARROW-7510: --- It has an accessor GetNullCount, but the

[jira] [Resolved] (ARROW-7576) [C++][Dev] Improve fuzzing setup

2020-01-15 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-7576. --- Fix Version/s: 0.16.0 Resolution: Fixed Issue resolved by pull

[jira] [Assigned] (ARROW-7576) [C++][Dev] Improve fuzzing setup

2020-01-15 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7576: - Assignee: Antoine Pitrou > [C++][Dev] Improve fuzzing setup >

[jira] [Updated] (ARROW-6895) [C++][Parquet] parquet::arrow::ColumnReader: ByteArrayDictionaryRecordReader repeats returned values when calling `NextBatch()`

2020-01-15 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-6895: -- Priority: Critical (was: Major) > [C++][Parquet]

[jira] [Assigned] (ARROW-7545) [C++] [Dataset] Scanning dataset with dictionary type hangs

2020-01-13 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7545: - Assignee: Francois Saint-Jacques > [C++] [Dataset] Scanning dataset

[jira] [Resolved] (ARROW-7545) [C++] [Dataset] Scanning dataset with dictionary type hangs

2020-01-14 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-7545. --- Resolution: Duplicate > [C++] [Dataset] Scanning dataset with dictionary

[jira] [Assigned] (ARROW-6895) [C++][Parquet] parquet::arrow::ColumnReader: ByteArrayDictionaryRecordReader repeats returned values when calling `NextBatch()`

2020-01-14 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-6895: - Assignee: Francois Saint-Jacques (was: Wes McKinney) > [C++][Parquet]

[jira] [Assigned] (ARROW-7640) [C++][Dataset] segfault when reading compressed Parquet files if build didn't include support for codec

2020-01-21 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7640: - Assignee: Francois Saint-Jacques > [C++][Dataset] segfault when reading

[jira] [Updated] (ARROW-7650) [C++] Dataset tests not built on Windows

2020-01-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7650: -- Priority: Blocker (was: Critical) > [C++] Dataset tests not built on Windows

[jira] [Assigned] (ARROW-7650) [C++] Dataset tests not built on Windows

2020-01-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7650: - Assignee: Francois Saint-Jacques > [C++] Dataset tests not built on

[jira] [Assigned] (ARROW-7650) [C++] Dataset tests not built on Windows

2020-01-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7650: - Assignee: Ben Kietzman (was: Francois Saint-Jacques) > [C++] Dataset

[jira] [Updated] (ARROW-7650) [C++] Dataset tests not built on Windows

2020-01-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7650: -- Fix Version/s: 0.16.0 > [C++] Dataset tests not built on Windows >

[jira] [Updated] (ARROW-7640) [C++][Dataset] segfault when reading compressed Parquet files if build didn't include support for codec

2020-01-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7640: -- Fix Version/s: 0.16.0 > [C++][Dataset] segfault when reading compressed

[jira] [Updated] (ARROW-7640) [C++][Dataset] segfault when reading compressed Parquet files if build didn't include support for codec

2020-01-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7640: -- Priority: Blocker (was: Major) > [C++][Dataset] segfault when reading

[jira] [Created] (ARROW-7653) [C++][Dataset] Handle DictType index mismatch better

2020-01-22 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7653: - Summary: [C++][Dataset] Handle DictType index mismatch better Key: ARROW-7653 URL: https://issues.apache.org/jira/browse/ARROW-7653 Project: Apache

[jira] [Resolved] (ARROW-7600) [C++][Parquet] Add a basic disabled unit test to exercise nesting functionality

2020-01-17 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-7600. --- Fix Version/s: 0.16.0 Resolution: Fixed Issue resolved by pull

[jira] [Commented] (ARROW-3873) [C++] Build shared libraries consistently with -fvisibility=hidden

2020-01-17 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018044#comment-17018044 ] Francois Saint-Jacques commented on ARROW-3873: --- [~apitrou] would that be solved by using

[jira] [Resolved] (ARROW-7593) [CI][Python] Python datasets failing on master / not run on CI

2020-01-17 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-7593. --- Resolution: Fixed Issue resolved by pull request 6214

[jira] [Created] (ARROW-7602) [Archery] Add more build options

2020-01-17 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7602: - Summary: [Archery] Add more build options Key: ARROW-7602 URL: https://issues.apache.org/jira/browse/ARROW-7602 Project: Apache Arrow

[jira] [Resolved] (ARROW-7602) [Archery] Add more build options

2020-01-17 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-7602. --- Fix Version/s: 0.16.0 Resolution: Fixed Issue resolved by pull

[jira] [Resolved] (ARROW-7621) [Doc] Doc build fails

2020-01-20 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-7621. --- Fix Version/s: 0.16.0 Resolution: Fixed Issue resolved by pull

[jira] [Assigned] (ARROW-7608) [C++][Dataset] Expose more informational properties

2020-01-21 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7608: - Assignee: Francois Saint-Jacques > [C++][Dataset] Expose more

[jira] [Commented] (ARROW-7608) [C++][Dataset] Expose more informational properties

2020-01-21 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020298#comment-17020298 ] Francois Saint-Jacques commented on ARROW-7608: --- * Dataset's source is found via

[jira] [Updated] (ARROW-7380) [C++][Dataset] Implement DatasetFactory

2020-01-20 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7380: -- Summary: [C++][Dataset] Implement DatasetFactory (was: [C++][Dataset]

[jira] [Assigned] (ARROW-7338) [C++] Improve InMemoryDataSource to support generator instead of static list

2020-01-20 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7338: - Assignee: (was: Francois Saint-Jacques) > [C++] Improve

[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-09 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011939#comment-17011939 ] Francois Saint-Jacques commented on ARROW-7498: --- You are right that it doesn't partition

[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-08 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010773#comment-17010773 ] Francois Saint-Jacques commented on ARROW-7498: --- Partitioning or Partitioner? >

[jira] [Assigned] (ARROW-7376) [C++] parquet NaN/null double statistics can result in endless loop

2020-01-08 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7376: - Assignee: Francois Saint-Jacques > [C++] parquet NaN/null double

[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-08 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010947#comment-17010947 ] Francois Saint-Jacques commented on ARROW-7498: --- For SchemaPartitioner (each directory is a

[jira] [Comment Edited] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-08 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010947#comment-17010947 ] Francois Saint-Jacques edited comment on ARROW-7498 at 1/8/20 7:09 PM:

[jira] [Updated] (ARROW-7376) [C++] parquet NaN/null double statistics can result in endless loop

2020-01-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7376: -- Priority: Critical (was: Major) > [C++] parquet NaN/null double statistics

[jira] [Commented] (ARROW-7545) [C++] [Dataset] Scanning dataset with dictionary type hangs

2020-01-14 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015062#comment-17015062 ] Francois Saint-Jacques commented on ARROW-7545: --- It looks like this is a parquet issue

[jira] [Updated] (ARROW-7545) [C++] [Dataset] Scanning dataset with dictionary type hangs

2020-01-14 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7545: -- Component/s: (was: C++ - Dataset) C++ > [C++] [Dataset]

[jira] [Created] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-06 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7498: - Summary: [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme Key: ARROW-7498 URL: https://issues.apache.org/jira/browse/ARROW-7498

[jira] [Commented] (ARROW-7501) [C++] CMake build_thrift should build flex and bison if necessary

2020-01-06 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009146#comment-17009146 ] Francois Saint-Jacques commented on ARROW-7501: --- It doesn't look like OSX is

[jira] [Assigned] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-07 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7498: - Assignee: Francois Saint-Jacques > [C++][Dataset] Rename

[jira] [Updated] (ARROW-7338) [C++] Improve InMemoryDataSource to support generator instead of static list

2020-01-07 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7338: -- Summary: [C++] Improve InMemoryDataSource to support generator instead of

[jira] [Commented] (ARROW-7510) [C++] Array::null_count() is not thread-compatible

2020-01-07 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009941#comment-17009941 ] Francois Saint-Jacques commented on ARROW-7510: --- Is this really an issue? Worst case, 2

[jira] [Updated] (ARROW-8058) [C++][Python][Dataset] Provide an option to toggle validation and schema inference in FileSystemDatasetFactoryOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8058: -- Summary: [C++][Python][Dataset] Provide an option to toggle validation and

[jira] [Commented] (ARROW-8118) [R] dim method for FileSystemDataset

2020-03-14 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059350#comment-17059350 ] Francois Saint-Jacques commented on ARROW-8118: --- A small point, getting the number of rows

[jira] [Commented] (ARROW-8028) [Go] Allow duplicate field names in schemas and nested types

2020-03-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057247#comment-17057247 ] Francois Saint-Jacques commented on ARROW-8028: --- {code:java} SELECT 1 AS one, 1 AS

[jira] [Commented] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056211#comment-17056211 ] Francois Saint-Jacques commented on ARROW-8061: --- Yes, this is possible, a ParquetFragment

[jira] [Created] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8065: - Summary: [C++][Dataset] Untangle Dataset, Fragment and ScanOptions Key: ARROW-8065 URL: https://issues.apache.org/jira/browse/ARROW-8065 Project:

[jira] [Updated] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8065: -- Component/s: C++ - Dataset > [C++][Dataset] Untangle Dataset, Fragment and

[jira] [Updated] (ARROW-7798) [R] Refactor R <-> Array conversion

2020-04-08 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7798: -- Description: There's a bit of technical debt accumulated in array_to_vector

[jira] [Updated] (ARROW-7798) [R] Refactor R <-> Array conversion

2020-04-08 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7798: -- Summary: [R] Refactor R <-> Array conversion (was: [R] Refactor vector to

[jira] [Resolved] (ARROW-8376) [R] Add experimental interface to ScanTask/RecordBatch iterators

2020-04-09 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-8376. --- Fix Version/s: 0.17.0 Resolution: Fixed Issue resolved by pull

[jira] [Created] (ARROW-8374) [R] Table to vector of DictonaryType will error when Arrays don't have the same Dictionary per array

2020-04-08 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8374: - Summary: [R] Table to vector of DictonaryType will error when Arrays don't have the same Dictionary per array Key: ARROW-8374 URL:

[jira] [Updated] (ARROW-8374) [R] Table to vector of DictonaryType will error when Arrays don't have the same Dictionary per array

2020-04-08 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8374: -- Component/s: R > [R] Table to vector of DictonaryType will error when Arrays

[jira] [Assigned] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-04-15 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-8065: - Assignee: Francois Saint-Jacques > [C++][Dataset] Untangle Dataset,

[jira] [Commented] (ARROW-8276) [C++][Dataset] Scanning a Fragment does not take into account the partition columns

2020-04-15 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084327#comment-17084327 ] Francois Saint-Jacques commented on ARROW-8276: --- [~jorisvandenbossche] I'm refactoring

[jira] [Updated] (ARROW-8488) [R] Replace VALUE_OR_STOP with ValueOrStop

2020-04-16 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8488: -- Component/s: R > [R] Replace VALUE_OR_STOP with ValueOrStop >

[jira] [Created] (ARROW-8488) [R] Replace VALUE_OR_STOP with ValueOrStop

2020-04-16 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8488: - Summary: [R] Replace VALUE_OR_STOP with ValueOrStop Key: ARROW-8488 URL: https://issues.apache.org/jira/browse/ARROW-8488 Project: Apache Arrow

[jira] [Resolved] (ARROW-8474) [CI][Crossbow] Skip some nightlies we don't need to run

2020-04-16 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-8474. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request

[jira] [Created] (ARROW-8348) [C++] Support optional sentinel values in primitive Array for nulls

2020-04-06 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8348: - Summary: [C++] Support optional sentinel values in primitive Array for nulls Key: ARROW-8348 URL: https://issues.apache.org/jira/browse/ARROW-8348

[jira] [Created] (ARROW-8354) [C++][R] Segfault in test-dataset.r

2020-04-06 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8354: - Summary: [C++][R] Segfault in test-dataset.r Key: ARROW-8354 URL: https://issues.apache.org/jira/browse/ARROW-8354 Project: Apache Arrow

[jira] [Updated] (ARROW-8427) [C++][Dataset] Do not ignore file paths with underscore/dot when full path was specified

2020-04-13 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8427: -- Labels: dataset pull-request-available (was: pull-request-available) >

[jira] [Resolved] (ARROW-8427) [C++][Dataset] Do not ignore file paths with underscore/dot when full path was specified

2020-04-13 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-8427. --- Resolution: Fixed Issue resolved by pull request 6915

[jira] [Created] (ARROW-8447) [C++][Dataset] Ensure Scanner::ToTable preserve ordering

2020-04-14 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8447: - Summary: [C++][Dataset] Ensure Scanner::ToTable preserve ordering Key: ARROW-8447 URL: https://issues.apache.org/jira/browse/ARROW-8447 Project:

[jira] [Resolved] (ARROW-8488) [R] Replace VALUE_OR_STOP with ValueOrStop

2020-04-20 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-8488. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request

[jira] [Created] (ARROW-8381) [C++][Dataset] Dataset writing should require a writer schema

2020-04-09 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8381: - Summary: [C++][Dataset] Dataset writing should require a writer schema Key: ARROW-8381 URL: https://issues.apache.org/jira/browse/ARROW-8381

[jira] [Comment Edited] (ARROW-8382) [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes

2020-04-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081130#comment-17081130 ] Francois Saint-Jacques edited comment on ARROW-8382 at 4/11/20, 3:52 AM:

[jira] [Comment Edited] (ARROW-8382) [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes

2020-04-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081130#comment-17081130 ] Francois Saint-Jacques edited comment on ARROW-8382 at 4/11/20, 3:51 AM:

[jira] [Commented] (ARROW-8382) [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes

2020-04-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081130#comment-17081130 ] Francois Saint-Jacques commented on ARROW-8382: --- The end goal of this is to write data to

[jira] [Updated] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-04-09 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8065: -- Description: Currently: a fragment is a product of a scan; it is a lazy

[jira] [Created] (ARROW-8497) [Archery] Add missing component to builds

2020-04-17 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8497: - Summary: [Archery] Add missing component to builds Key: ARROW-8497 URL: https://issues.apache.org/jira/browse/ARROW-8497 Project: Apache Arrow

[jira] [Updated] (ARROW-8382) [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes

2020-04-09 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8382: -- Description: WritePlan should look like the following. {code:c++} class

[jira] [Updated] (ARROW-8382) [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes

2020-04-09 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8382: -- Component/s: C++ - Dataset > [C++][Dataset] Refactor WritePlan to decouple

[jira] [Updated] (ARROW-8382) [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes

2020-04-09 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8382: -- Description: WritePlan should look like the following. {code:c++} class

[jira] [Created] (ARROW-8382) [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes

2020-04-09 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8382: - Summary: [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes Key: ARROW-8382 URL:

[jira] [Updated] (ARROW-8382) [C++][Dataset] Refactor WritePlan to decouple from Fragment/Scan/Partition classes

2020-04-09 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8382: -- Description: WritePlan should look like the following. {code:c++} class

[jira] [Commented] (ARROW-8111) [C++][CSV] Support DD/MM/YYYY date format

2020-03-13 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17058777#comment-17058777 ] Francois Saint-Jacques commented on ARROW-8111: --- And if you do, ensure to write the

[jira] [Assigned] (ARROW-7798) [R] Refactor vector to Array conversion

2020-03-25 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7798: - Assignee: (was: Francois Saint-Jacques) > [R] Refactor vector to

[jira] [Assigned] (ARROW-7818) [C++][Gandiva] Generate Filter kernels from gandiva code at compile time

2020-03-25 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7818: - Assignee: (was: Francois Saint-Jacques) > [C++][Gandiva] Generate

[jira] [Resolved] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-27 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-8061. --- Fix Version/s: 0.17.0 Resolution: Fixed Issue resolved by pull

[jira] [Updated] (ARROW-7740) [R] Crash/bad data in converting Arrow list struct type

2020-03-31 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7740: -- Component/s: (was: R) > [R] Crash/bad data in converting Arrow list struct

[jira] [Updated] (ARROW-7740) [R] Crash/bad data in converting Arrow list struct type

2020-03-31 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-7740: -- Priority: Critical (was: Minor) > [R] Crash/bad data in converting Arrow list

[jira] [Commented] (ARROW-8213) [Python][Dataset] Opening a dataset with a local incorrect path gives confusing error message

2020-03-31 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072174#comment-17072174 ] Francois Saint-Jacques commented on ARROW-8213: --- Make the default constructor method always

[jira] [Updated] (ARROW-8213) [Python][Dataset] Opening a dataset with a local incorrect path gives confusing error message

2020-03-31 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8213: -- Component/s: C++ - Dataset > [Python][Dataset] Opening a dataset with a local

[jira] [Commented] (ARROW-8282) [C++/Python][Dataset] Support schema evolution for integer columns

2020-03-31 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072178#comment-17072178 ] Francois Saint-Jacques commented on ARROW-8282: --- Once we have instantiated Fragment, we can

[jira] [Updated] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-30 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8065: -- Description: Currently: a fragment is a product of a scan; it is a lazy

[jira] [Closed] (ARROW-6953) [C++][Dataset] Implement Gandiva Filter/Projector in Scanner

2020-04-01 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques closed ARROW-6953. - Resolution: Won't Fix This will be in the compute engine > [C++][Dataset]

[jira] [Assigned] (ARROW-8005) [Website] Review and adjust any usages of Apache dist system from website / tools

2020-04-01 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-8005: - Assignee: Francois Saint-Jacques > [Website] Review and adjust any

[jira] [Created] (ARROW-8318) [C++][Dataset] Dataset should instantiate Fragment

2020-04-02 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8318: - Summary: [C++][Dataset] Dataset should instantiate Fragment Key: ARROW-8318 URL: https://issues.apache.org/jira/browse/ARROW-8318 Project: Apache

[jira] [Assigned] (ARROW-8244) [Python][Parquet] Add `write_to_dataset` option to populate the "file_path" metadata fields

2020-04-02 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-8244: - Assignee: Joris Van den Bossche > [Python][Parquet] Add

[jira] [Commented] (ARROW-3388) [Python] boolean Partition keys in ParquetDataset are reconstructed as string

2020-04-28 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094518#comment-17094518 ] Francois Saint-Jacques commented on ARROW-3388: --- I think it's normal to require a schema.

[jira] [Created] (ARROW-8601) [Go][Flight] Implement Flight Writer interface

2020-04-27 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8601: - Summary: [Go][Flight] Implement Flight Writer interface Key: ARROW-8601 URL: https://issues.apache.org/jira/browse/ARROW-8601 Project: Apache Arrow

[jira] [Updated] (ARROW-8604) [R] Windows compilation failure

2020-04-27 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8604: -- Description: [Master|[https://github.com/apache/arrow/runs/622393526]] fails

[jira] [Created] (ARROW-8602) [CMake] Fix ws2_32 link issue when cross-compiling on Linux

2020-04-27 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8602: - Summary: [CMake] Fix ws2_32 link issue when cross-compiling on Linux Key: ARROW-8602 URL: https://issues.apache.org/jira/browse/ARROW-8602 Project:
