[GitHub] [arrow-julia] kou closed issue #301: Release script publishes the artifacts to wrong URL

2022-03-07 Thread GitBox


kou closed issue #301:
URL: https://github.com/apache/arrow-julia/issues/301


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow-julia] kou opened a new issue #301: Release script publishes the artifacts to wrong URL

2022-03-07 Thread GitBox


kou opened a new issue #301:
URL: https://github.com/apache/arrow-julia/issues/301


   It publishes to
   https://dist.apache.org/repos/dist/release/arrow/apache-arrow-julia-X.Y.Z but
   we should remove the "apache-" prefix because other releases don't have it.






[jira] [Created] (ARROW-15865) [Java][Doc]: Configure local maven to consume github arrow nightly assets

2022-03-07 Thread David Dali Susanibar Arce (Jira)
David Dali Susanibar Arce created ARROW-15865:
-

 Summary: [Java][Doc]: Configure local maven to consume github 
arrow nightly assets
 Key: ARROW-15865
 URL: https://issues.apache.org/jira/browse/ARROW-15865
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Java
Reporter: David Dali Susanibar Arce
Assignee: David Dali Susanibar Arce


Current maven configuration to integrate with github assets repository:
{code:java}

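<settings xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.1.0 http://maven.apache.org/xsd/settings-1.1.0.xsd"
    xmlns="http://maven.apache.org/SETTINGS/1.1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <profiles>
    <profile>
      <repositories>
        <repository>
          <id>staged</id>
          <name>staged-releases</name>
          <url>https://repository.apache.org/content/repositories/staging/</url>
          <releases>
            <enabled>true</enabled>
          </releases>
          <snapshots>
            <enabled>true</enabled>
            <updatePolicy>never</updatePolicy>
          </snapshots>
        </repository>
      </repositories>
      <id>arrowrc</id>
    </profile>
    <profile>
      <repositories>
        <repository>
          <id>staged</id>
          <name>staged-releases</name>
          <url>https://github.com/ursacomputing/crossbow/releases/tag/release-7.0.0-rc10-0-github-java-jars</url>
          <releases>
            <enabled>true</enabled>
          </releases>
          <snapshots>
            <enabled>true</enabled>
          </snapshots>
        </repository>
      </repositories>
      <id>arrownightly</id>
    </profile>
  </profiles>
</settings>
{code}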
Running "mvn -Parrownightly clean install" downloads files to the local .m2 
repository, but they are invalid jar/pom files.

Define a way to integrate Maven with the current GitHub assets repository so 
that the assets download properly, without errors.
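
One possible explanation for the invalid files (an assumption, not confirmed in
this issue): Maven resolves an artifact by appending the standard repository
layout path to the repository URL, and a GitHub release tag page does not serve
files under that layout. The Python sketch below builds the URL Maven would
request. The coordinates org.apache.arrow:arrow-vector:7.0.0 and the helper
name are chosen only for illustration.

```python
def maven_artifact_url(repo_url, group_id, artifact_id, version, ext="jar"):
    """Build the URL requested for an artifact in a standard-layout Maven repository."""
    group_path = group_id.replace(".", "/")       # org.apache.arrow -> org/apache/arrow
    file_name = f"{artifact_id}-{version}.{ext}"  # arrow-vector-7.0.0.jar
    return "/".join([repo_url.rstrip("/"), group_path, artifact_id, version, file_name])


# Repository URL copied from the "arrownightly" profile in the issue.
NIGHTLY_REPO = (
    "https://github.com/ursacomputing/crossbow/releases/tag/"
    "release-7.0.0-rc10-0-github-java-jars"
)

if __name__ == "__main__":
    # Illustrative coordinates; any artifact resolves the same way.
    for ext in ("pom", "jar"):
        print(maven_artifact_url(NIGHTLY_REPO, "org.apache.arrow", "arrow-vector", "7.0.0", ext))
```

GitHub serves release assets from a flat releases/download/<tag>/<file> path,
so the nested path printed above does not correspond to an uploaded asset.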



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (ARROW-15864) [Java][Doc]: Arrow java nightly build

2022-03-07 Thread David Dali Susanibar Arce (Jira)
David Dali Susanibar Arce created ARROW-15864:
-

 Summary: [Java][Doc]: Arrow java nightly build
 Key: ARROW-15864
 URL: https://issues.apache.org/jira/browse/ARROW-15864
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Java
Reporter: David Dali Susanibar Arce
Assignee: David Dali Susanibar Arce


Currently, the artifacts of the Java nightly build are uploaded to GitHub as assets.

 





[jira] [Created] (ARROW-15863) [Packaging][C++][Python] Conda package build failure

2022-03-07 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-15863:
--

 Summary: [Packaging][C++][Python] Conda package build failure
 Key: ARROW-15863
 URL: https://issues.apache.org/jira/browse/ARROW-15863
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Packaging, Python
Reporter: Antoine Pitrou


The Windows conda package builds are failing:
https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=20856&view=logs&j=4c86bc1b-1091-5192-4404-c74dfaad23e7&t=1e0e7149-0c33-565b-af41-050b54dd61be





[GitHub] [arrow-julia] nilshg opened a new issue #300: Possible bug in `Any` concretization routine

2022-03-07 Thread GitBox


nilshg opened a new issue #300:
URL: https://github.com/apache/arrow-julia/issues/300


   As discussed on Slack:
   
   ```
   julia> using Arrow, DataFrames
   
   julia> Arrow.write("test.arrow", (a = [1, 2], b = Any[3, 4.5]))
   "test.arrow"
   
   julia> DataFrame(Arrow.Table("test.arrow"))
   2×2 DataFrame
    Row │ a      b
        │ Int64  Float64
   ─────┼─────────────────
      1 │     1  1.5e-323
      2 │     2  4.5
   
   (jl_aYpToJ) pkg> st
 Status `C:\Users\ngudat\AppData\Local\Temp\jl_aYpToJ\Project.toml`
 [69666777] Arrow v2.2.0
   ```
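
   A possible reading of the output above (an assumption, not something
   established in this issue): 1.5e-323 is exactly the value obtained when the
   64-bit pattern of the integer 3 is read as a Float64, which would suggest
   that the integer was stored without being converted. A minimal Python sketch
   of that reinterpretation, using only the standard library:

```python
import struct


def int64_bits_as_float64(n: int) -> float:
    """Reinterpret the 8 bytes of a signed 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<q", n))[0]


if __name__ == "__main__":
    # The integer 3 has only its two lowest bits set, so it maps to a subnormal double.
    print(int64_bits_as_float64(3))  # 1.5e-323
```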






[jira] [Created] (ARROW-15862) [R][C++] Provide a way to go from integer to duration

2022-03-07 Thread Jira
Dragoș Moldovan-Grünfeld created ARROW-15862:


 Summary: [R][C++] Provide a way to go from integer to duration
 Key: ARROW-15862
 URL: https://issues.apache.org/jira/browse/ARROW-15862
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, R
Reporter: Dragoș Moldovan-Grünfeld


Currently it is not possible to directly create a duration object from a 
numeric one (for example through casting).
{code:r}
library(arrow)

a <- Array$create(32L)
a$cast(duration("s"))
#> Error: NotImplemented: Unsupported cast from int32 to duration using function cast_duration
#> /Users/dragos/Documents/arrow/cpp/src/arrow/compute/function.cc:231  DispatchBest()
{code}

This underpins a lot of the date-time arithmetic in R, which supports the 
conversion/coercion of an integer to difftime (R's equivalent of duration), 
such as in the pipeline below.
{code:r}
library(arrow, warn.conflicts = FALSE)
#> See arrow_info() for available features
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

df <- tibble(time = as_datetime(c("2022-03-07 15:00:28", "2022-03-06 14:00:28")))
df
#> # A tibble: 2 × 1
#>   time
#>   <dttm>
#> 1 2022-03-07 15:00:28
#> 2 2022-03-06 14:00:28

df %>% 
  mutate(time2 = time + seconds(2))
#> # A tibble: 2 × 2
#>   time                time2
#>   <dttm>              <dttm>
#> 1 2022-03-07 15:00:28 2022-03-07 15:00:30
#> 2 2022-03-06 14:00:28 2022-03-06 14:00:30
{code}
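
For comparison, here is the same integer-to-duration step written with Python's
standard library. This is not Arrow's API; it only illustrates the semantics the
R pipeline relies on, where an integer count of seconds becomes a duration value
that can be added to a timestamp. The helper name is hypothetical.

```python
from datetime import datetime, timedelta


def add_seconds(ts: datetime, n: int) -> datetime:
    """Turn an integer number of seconds into a duration and add it to a timestamp."""
    return ts + timedelta(seconds=n)


if __name__ == "__main__":
    # Same timestamps as in the tibble above.
    times = [datetime(2022, 3, 7, 15, 0, 28), datetime(2022, 3, 6, 14, 0, 28)]
    for t in times:
        print(t, add_seconds(t, 2))
```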





[jira] [Created] (ARROW-15861) [Java][Flight] grpc-netty, version mismatch, incompatible ctor for "PooledByteBufAllocator" in io.grpc.netty.Utils#createByteBufAllocator

2022-03-07 Thread Gavin Ray (Jira)
Gavin Ray created ARROW-15861:
-

 Summary: [Java][Flight] grpc-netty, version mismatch, incompatible 
ctor for "PooledByteBufAllocator" in io.grpc.netty.Utils#createByteBufAllocator 
 Key: ARROW-15861
 URL: https://issues.apache.org/jira/browse/ARROW-15861
 Project: Apache Arrow
  Issue Type: Bug
  Components: FlightRPC, Java
Affects Versions: 8.0.0
Reporter: Gavin Ray
 Attachments: image-2022-03-07-10-47-09-355.png

Using Arrow nightly jars from 03/03/2022

{code:java}
val LOCALHOST = "localhost"
val allocator = RootAllocator(Long.MAX_VALUE)
val serverLocation = Location.forGrpcInsecure(LOCALHOST, 0)
val producer = DataWrapperFlightSQLProducer(serverLocation)
val server = FlightServer.builder(allocator, serverLocation, producer).build().start()
val clientLocation = Location.forGrpcInsecure(LOCALHOST, server.port)
val client = FlightSqlClient(FlightClient.builder(allocator, clientLocation).build())
{code}

This throws the following error (from "FlightServer.builder")
{code:java}
'void io.netty.buffer.PooledByteBufAllocator.<init>(boolean, int, int, int, int, int, int, boolean)'
java.lang.NoSuchMethodError: 'void io.netty.buffer.PooledByteBufAllocator.<init>(boolean, int, int, int, int, int, int, boolean)'
at io.grpc.netty.Utils.createByteBufAllocator(Utils.java:176)
at io.grpc.netty.Utils.access$000(Utils.java:75)
at io.grpc.netty.Utils$ByteBufAllocatorPreferDirectHolder.<clinit>(Utils.java:97)
at io.grpc.netty.Utils.getByteBufAllocator(Utils.java:144)
at io.grpc.netty.NettyServer.start(NettyServer.java:205)
at io.grpc.internal.ServerImpl.start(ServerImpl.java:183)
at io.grpc.internal.ServerImpl.start(ServerImpl.java:92)
at org.apache.arrow.flight.FlightServer.start(FlightServer.java:83)
at 
FlightSQLServerAndClientTest.(FlightSQLServerAndClientTest.kt:33)
{code}

The reason is that the constructor is incompatible:

 !image-2022-03-07-10-47-09-355.png! 

To fix this, you can override the versions of Arrow's dependencies:
{code:groovy}
implementation("io.grpc", "grpc-netty").version {
    strictly("1.44.1")
}
implementation("io.netty", "netty-all").version {
    strictly("4.1.74.Final")
}
implementation("io.netty", "netty-codec").version {
    strictly("4.1.74.Final")
}
{code}





[jira] [Created] (ARROW-15860) [Python][Docs] Document RecordBatchReader

2022-03-07 Thread Will Jones (Jira)
Will Jones created ARROW-15860:
--

 Summary: [Python][Docs] Document RecordBatchReader
 Key: ARROW-15860
 URL: https://issues.apache.org/jira/browse/ARROW-15860
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Documentation, Python
Affects Versions: 7.0.0
Reporter: Will Jones
 Fix For: 8.0.0


RecordBatchReader seems like a pretty important type, but it is missing from 
the Python API docs.





[jira] [Created] (ARROW-15859) [C++] Add nightly test for static build with arrow_flight_static and arrow_bundled_dependencies

2022-03-07 Thread Rok Mihevc (Jira)
Rok Mihevc created ARROW-15859:
--

 Summary: [C++] Add nightly test for static build with 
arrow_flight_static and arrow_bundled_dependencies
 Key: ARROW-15859
 URL: https://issues.apache.org/jira/browse/ARROW-15859
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Continuous Integration
Reporter: Rok Mihevc


Due to abseil dependencies, static builds with arrow_bundled_dependencies are 
brittle. We could test them nightly with the example proposed in ARROW-14708.





[jira] [Created] (ARROW-15858) [R][C++] Support duration creation from integer

2022-03-07 Thread Jira
Dragoș Moldovan-Grünfeld created ARROW-15858:


 Summary: [R][C++] Support duration creation from integer
 Key: ARROW-15858
 URL: https://issues.apache.org/jira/browse/ARROW-15858
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, R
Reporter: Dragoș Moldovan-Grünfeld


I would expect both {{a}} and {{b}} to create a {{duration}} object of 32 
seconds, but the second one returns an {{int32}}
{code:r}
library(arrow, warn.conflicts = FALSE)

a <- as.difftime(32, units = "secs")
b <- as.difftime(32L, units = "secs")

Array$create(a)
#> Array
#> <duration[s]>
#> [
#>   32
#> ]
Array$create(b)
#> Array
#> <int32>
#> [
#>   32
#> ]
{code}
If I try to be explicit, I get somewhat of a clue why that might be happening:
{code:r}
 
Array$create(a, type = duration())
#> Array
#> <duration[s]>
#> [
#>   32
#> ]
Array$create(b, type = duration())
#> Error:
#> ! NotImplemented: Extend
{code}
Nevertheless, the fallback to creating an integer was unexpected.

Also, not sure if this is a bug, an improvement or a new feature.





[jira] [Created] (ARROW-15857) [R] rhub/fedora-clang-devel fails to install 'sass' (rmarkdown dependency)

2022-03-07 Thread Dewey Dunnington (Jira)
Dewey Dunnington created ARROW-15857:


 Summary: [R] rhub/fedora-clang-devel fails to install 'sass' 
(rmarkdown dependency)
 Key: ARROW-15857
 URL: https://issues.apache.org/jira/browse/ARROW-15857
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Reporter: Dewey Dunnington


Starting 2022-03-03, we get a failure on the rhub/fedora-clang-devel nightly 
build. It seems to be a linking error but nothing in the sass package seems to 
have changed for some time (last update May 2021).

https://github.com/ursacomputing/crossbow/runs/5444005154?check_suite_focus=true#step:5:3007

Build log for the sass package:

{noformat}
#14 1099.2 make[1]: Entering directory 
'/tmp/RtmpvEMraB/R.INSTALL555d42b8f18e/sass/src'
#14 1099.2 /opt/R-devel/lib64/R/share/make/shlib.mk:18: warning: overriding 
recipe for target 'shlib-clean'
#14 1099.2 Makevars:12: warning: ignoring old recipe for target 'shlib-clean'
#14 1099.2 /usr/bin/clang -I"/opt/R-devel/lib64/R/include" -DNDEBUG 
-I./libsass/include  -I/usr/local/include   -fpic  -g -O2  -c compile.c -o 
compile.o
#14 1099.2 /usr/bin/clang -I"/opt/R-devel/lib64/R/include" -DNDEBUG 
-I./libsass/include  -I/usr/local/include   -fpic  -g -O2  -c init.c -o init.o
#14 1099.2 MAKEFLAGS= CC="/usr/bin/clang" CFLAGS="-g -O2 " 
CXX="/usr/bin/clang++ -std=gnu++14 -stdlib=libc++" AR="ar" 
LDFLAGS="-L/usr/local/lib64" make -C libsass
#14 1099.2 make[2]: Entering directory 
'/tmp/RtmpvEMraB/R.INSTALL555d42b8f18e/sass/src/libsass'
#14 1099.2 /usr/bin/clang -g -O2  -O2 -I ./include  -fPIC -c -o src/cencode.o 
src/cencode.c
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/ast.o src/ast.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/ast_values.o src/ast_values.cpp
#14 1099.2 src/ast_values.cpp:484:23: warning: loop variable 'numerator' 
creates a copy from type 'const std::__1::basic_string, std::__1::allocator>' 
[-Wrange-loop-construct]
#14 1099.2   for (const auto numerator : numerators)
#14 1099.2   ^
#14 1099.2 src/ast_values.cpp:484:12: note: use reference type 'const 
std::__1::basic_string, 
std::__1::allocator> &' to prevent copying
#14 1099.2   for (const auto numerator : numerators)
#14 1099.2^~
#14 1099.2   &
#14 1099.2 src/ast_values.cpp:486:23: warning: loop variable 'denominator' 
creates a copy from type 'const std::__1::basic_string, std::__1::allocator>' 
[-Wrange-loop-construct]
#14 1099.2   for (const auto denominator : denominators)
#14 1099.2   ^
#14 1099.2 src/ast_values.cpp:486:12: note: use reference type 'const 
std::__1::basic_string, 
std::__1::allocator> &' to prevent copying
#14 1099.2   for (const auto denominator : denominators)
#14 1099.2^~~~
#14 1099.2   &
#14 1099.2 2 warnings generated.
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/ast_supports.o src/ast_supports.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/ast_sel_cmp.o src/ast_sel_cmp.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/ast_sel_unify.o src/ast_sel_unify.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/ast_sel_super.o src/ast_sel_super.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/ast_sel_weave.o src/ast_sel_weave.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/ast_selectors.o src/ast_selectors.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/context.o src/context.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/constants.o src/constants.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/fn_utils.o src/fn_utils.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/fn_miscs.o src/fn_miscs.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/fn_maps.o src/fn_maps.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/fn_lists.o src/fn_lists.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 
./include  -fPIC -c -o src/fn_colors.o src/fn_colors.cpp
#14 1099.2 /usr/bin/clang++ -std=gnu++14 -stdlib=libc++ -Wall -O2 -std=c++11 -I 

[jira] [Created] (ARROW-15856) [R] S3FileSystem - open_dataset

2022-03-07 Thread Martin du Toit (Jira)
Martin du Toit created ARROW-15856:
--

 Summary: [R] S3FileSystem - open_dataset
 Key: ARROW-15856
 URL: https://issues.apache.org/jira/browse/ARROW-15856
 Project: Apache Arrow
  Issue Type: New Feature
  Components: R
Affects Versions: 7.0.0
Reporter: Martin du Toit


Hi

I can successfully create an S3FileSystem that connects via MinIO.

I can create a SubTreeFileSystem: 
s3://investmentaccountingdata/rawdata/transactions/transactions-xxx/v1.1/

I can list the files in the SubTreeFileSystem, and I can open a dataset from 
the list of files:
{code:r}
list_files <- sfs$ls(recursive = TRUE)
ds <- arrow::open_dataset(sources = list_files, schema = schema_file,
                          format = csv_format, filesystem = sfs)
{code}
This all works fine if I provide the list of files, but I want to specify a 
path higher up so that the subfolders are included as partitions. The code I 
use works perfectly when I run it against a local disk.

How can I call open_dataset with a folder as the source?

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (ARROW-15855) [Python]Add dictionary_pagesize_limit to Parquet writer

2022-03-07 Thread Xinyu Zeng (Jira)
Xinyu Zeng created ARROW-15855:
--

 Summary: [Python]Add dictionary_pagesize_limit to Parquet writer
 Key: ARROW-15855
 URL: https://issues.apache.org/jira/browse/ARROW-15855
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Parquet, Python
Reporter: Xinyu Zeng
 Fix For: 7.0.0


Although the Python Parquet API is a wrapper around the C++ one, some
tuning knobs are not exposed in Python, for example
dictionary_pagesize_limit_. The dictionary page size easily exceeds
the limit when one or more of the following hold:
1. The row_group_size is relatively large, e.g. the default is 64M.
2. The size per entry is large, e.g. a large string column.
3. The data contains few repeated values.
If this parameter cannot be tuned, dictionary encoding may not be
fully utilized. In C++, however, the parameter can be tuned to an
optimal setting.

Other parameters are also not exposed in Python, for example
max_statistics_size.
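
A rough back-of-the-envelope sketch of the reasoning above, in Python. The
1 MiB limit is an assumption about the C++ default, and the column figures and
helper name are made up for illustration. The point is only that the dictionary
size grows with the number of distinct values times the size of each entry.

```python
# Assumed default limit in bytes; not taken from the issue.
ASSUMED_DICT_PAGE_LIMIT = 1024 * 1024


def dictionary_fits(distinct_values: int, avg_entry_bytes: int,
                    limit: int = ASSUMED_DICT_PAGE_LIMIT) -> bool:
    """Estimate whether a column chunk's dictionary page stays within the limit."""
    return distinct_values * avg_entry_bytes <= limit


if __name__ == "__main__":
    # Hypothetical columns: (distinct values in the row group, average bytes per entry).
    print(dictionary_fits(10_000, 8))    # small fixed-width values: 80,000 bytes
    print(dictionary_fits(50_000, 100))  # long strings, many distinct: 5,000,000 bytes
```

When the estimate exceeds the limit, a writer would be expected to stop using
the dictionary for the rest of the column chunk, which is the under-utilization
described above.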


