[jira] [Created] (ARROW-17154) [C++] Change cmake project name from arrow_python to pyarrow_cpp
Alenka Frim created ARROW-17154:
-----------------------------------

Summary: [C++] Change cmake project name from arrow_python to pyarrow_cpp
Key: ARROW-17154
URL: https://issues.apache.org/jira/browse/ARROW-17154
Project: Apache Arrow
Issue Type: Sub-task
Components: C++
Reporter: Alenka Frim
Assignee: Alenka Frim
Fix For: 10.0.0

See discussion https://github.com/apache/arrow/pull/13311#discussion_r926198302
[jira] [Created] (ARROW-17153) [CI][Homebrew] Require glib-utils
Kouhei Sutou created ARROW-17153:
-----------------------------------

Summary: [CI][Homebrew] Require glib-utils
Key: ARROW-17153
URL: https://issues.apache.org/jira/browse/ARROW-17153
Project: Apache Arrow
Issue Type: Improvement
Components: Continuous Integration, GLib
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
Fix For: 9.0.0
[jira] [Created] (ARROW-17152) [Docs] Enable dark mode on documentation site
Will Jones created ARROW-17152:
-----------------------------------

Summary: [Docs] Enable dark mode on documentation site
Key: ARROW-17152
URL: https://issues.apache.org/jira/browse/ARROW-17152
Project: Apache Arrow
Issue Type: New Feature
Reporter: Will Jones
Fix For: 10.0.0
Attachments: Screen Shot 2022-07-20 at 3.10.51 PM.png, Screen Shot 2022-07-20 at 3.12.18 PM.png

pydata-sphinx-theme adds dark mode in version 0.9.0. We will need to adapt our logo ([see docs|https://pydata-sphinx-theme.readthedocs.io/en/stable/user_guide/configuring.html?highlight=dark#different-logos-for-light-and-dark-mode]). There are also some places in the docs where we may need to adjust additional CSS. See attached screenshots.
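For reference, a minimal sketch of the logo switch in the Sphinx {{conf.py}}, following the pydata-sphinx-theme configuration guide linked above (the image file names here are placeholders, not the actual Arrow assets):

{code:python}
# conf.py (sketch): pydata-sphinx-theme 0.9 can pick a logo per color
# scheme; "logo-light.png" / "logo-dark.png" are hypothetical file names
html_theme_options = {
    "logo": {
        "image_light": "logo-light.png",
        "image_dark": "logo-dark.png",
    }
}
{code}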
[jira] [Created] (ARROW-17151) [Docs] Pin pydata-sphinx-theme to 0.8 to avoid dark mode
Will Jones created ARROW-17151:
-----------------------------------

Summary: [Docs] Pin pydata-sphinx-theme to 0.8 to avoid dark mode
Key: ARROW-17151
URL: https://issues.apache.org/jira/browse/ARROW-17151
Project: Apache Arrow
Issue Type: Improvement
Components: Documentation
Reporter: Will Jones
Fix For: 9.0.0

pydata-sphinx-theme introduced automatic dark mode. However, there are a number of changes we need to make (such as providing a dark-mode Arrow logo) before we are ready for it. For the 9.0.0 release, we should instead pin to the version of pydata-sphinx-theme just before that release.
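Concretely, the pin might look like this in the docs requirements file (a sketch; it assumes 0.8.1 is the last release before automatic dark mode landed in 0.9.0, which should be verified against PyPI):

{code}
# pin to the last pre-dark-mode release (0.8.1 assumed)
pydata-sphinx-theme==0.8.1
{code}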
[jira] [Created] (ARROW-17150) [R] Allow statically linked libcurl in GCS when building libarrow DLL in RTools
Will Jones created ARROW-17150:
-----------------------------------

Summary: [R] Allow statically linked libcurl in GCS when building libarrow DLL in RTools
Key: ARROW-17150
URL: https://issues.apache.org/jira/browse/ARROW-17150
Project: Apache Arrow
Issue Type: Improvement
Components: R
Affects Versions: 9.0.0
Reporter: Will Jones
Fix For: 10.0.0

Neal's patch in ARROW-16510 enabled libcurl to be linked statically into the Google Cloud Storage dependency, but this only seems to work for static libraries on RTools (Windows). For development RTools environments we use dynamic Arrow libraries instead, and there we currently get linker errors against libcurl when ARROW_GCS is on.
[jira] [Created] (ARROW-17149) [R] Enable GCS tests for Windows
Will Jones created ARROW-17149:
-----------------------------------

Summary: [R] Enable GCS tests for Windows
Key: ARROW-17149
URL: https://issues.apache.org/jira/browse/ARROW-17149
Project: Apache Arrow
Issue Type: Improvement
Components: Continuous Integration, R
Affects Versions: 9.0.0
Reporter: Will Jones
Fix For: 10.0.0

In ARROW-16879, I found the GCS tests were hanging in CI, but couldn't diagnose why. We should solve that and enable the tests.
[GitHub] [arrow-adbc] lidavidm merged pull request #41: [Python] Complete minimal bindings for ADBC
lidavidm merged PR #41:
URL: https://github.com/apache/arrow-adbc/pull/41
[GitHub] [arrow-adbc] lidavidm closed issue #37: Reorganize and complete Python bindings
lidavidm closed issue #37: Reorganize and complete Python bindings
URL: https://github.com/apache/arrow-adbc/issues/37
[jira] [Created] (ARROW-17148) [R] Improve evaluation of R functions from C++
Dewey Dunnington created ARROW-17148:
-------------------------------------

Summary: [R] Improve evaluation of R functions from C++
Key: ARROW-17148
URL: https://issues.apache.org/jira/browse/ARROW-17148
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Dewey Dunnington

There are currently a few places where we call R code from C++ (and after ARROW-16444 and ARROW-16703 we will have more, where the overhead of calling into R might be greater than the time it takes to actually evaluate the function, or where the functions will be called in a tight loop). The current approach uses {{cpp11::function}}. This is totally fine and safe, but it generates some ugly backtraces on error and is potentially slower than the lean-and-mean approach of purrr (whose entire job is to call R functions in a loop and which has been heavily optimized). The purrr approach is to construct the {{call()}} and calling environment in advance and then just run {{Rf_eval(call, env)}} in the loop. This is both faster (fewer R API calls) and generates better backtraces (e.g., {{Error in fun(arg1, arg2)}} instead of {{Error in (function(a, b) { ...the whole content of the function... })(every, deparsed, argument)}}). Before optimizing that heavily we should of course benchmark to see exactly how much it matters!
[jira] [Created] (ARROW-17147) [R] parse_date_time should support locale parameter
Rok Mihevc created ARROW-17147:
-----------------------------------

Summary: [R] parse_date_time should support locale parameter
Key: ARROW-17147
URL: https://issues.apache.org/jira/browse/ARROW-17147
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Rok Mihevc

See [discussion here|https://github.com/apache/arrow/pull/13627#discussion_r924875872].
[jira] [Created] (ARROW-17146) [R] parse_date_time should support quiet = FALSE
Rok Mihevc created ARROW-17146:
-----------------------------------

Summary: [R] parse_date_time should support quiet = FALSE
Key: ARROW-17146
URL: https://issues.apache.org/jira/browse/ARROW-17146
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Rok Mihevc

See [discussion here|https://github.com/apache/arrow/pull/13627#discussion_r924875872].
[GitHub] [arrow-adbc] lidavidm opened a new pull request, #41: [Python] Complete minimal bindings for ADBC
lidavidm opened a new pull request, #41:
URL: https://github.com/apache/arrow-adbc/pull/41

Also refactors the bindings to not depend on PyArrow.
[jira] [Created] (ARROW-17145) [C++] Compilation warnings on gcc in release mode
Antoine Pitrou created ARROW-17145:
-----------------------------------

Summary: [C++] Compilation warnings on gcc in release mode
Key: ARROW-17145
URL: https://issues.apache.org/jira/browse/ARROW-17145
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Antoine Pitrou

With gcc 10.3 I get this warning in release mode.
{code}
[168/321] Building CXX object src/arrow/CMakeFiles/arrow_testing_objlib.dir/compute/exec/test_util.cc.o
In file included from /home/antoine/arrow/dev/cpp/src/arrow/compute/exec/test_util.h:28,
                 from /home/antoine/arrow/dev/cpp/src/arrow/compute/exec/test_util.cc:18:
/home/antoine/arrow/dev/cpp/src/arrow/compute/exec.h: In member function 'R arrow::internal::FnOnce<R(A...)>::FnImpl<Fn>::invoke(A&& ...) [with Fn = arrow::Future<>::WrapResultyOnComplete::Callback::ThenOnComplete >)::, arrow::Future<>::PassthruOnFailure >):: > > >; R = void; A = {const arrow::FutureImpl&}]':
/home/antoine/arrow/dev/cpp/src/arrow/compute/exec.h:177:21: warning: '*((void*)(&)+8).arrow::compute::ExecBatch::length' may be used uninitialized in this function [-Wmaybe-uninitialized]
  177 | struct ARROW_EXPORT ExecBatch {
      |                     ^
{code}
[jira] [Created] (ARROW-17144) Adding sqrt Function
Sahaj Gupta created ARROW-17144:
-----------------------------------

Summary: Adding sqrt Function
Key: ARROW-17144
URL: https://issues.apache.org/jira/browse/ARROW-17144
Project: Apache Arrow
Issue Type: New Feature
Reporter: Sahaj Gupta

Adding sqrt function.
[jira] [Created] (ARROW-17143) [R] Add examples working with `tidyr::unnest` and `tidyr::unnest_longer`
SHIMA Tatsuya created ARROW-17143:
-----------------------------------

Summary: [R] Add examples working with `tidyr::unnest` and `tidyr::unnest_longer`
Key: ARROW-17143
URL: https://issues.apache.org/jira/browse/ARROW-17143
Project: Apache Arrow
Issue Type: Improvement
Components: Documentation, R
Affects Versions: 8.0.1
Reporter: SHIMA Tatsuya

Related to ARROW-8813

The arrow package can convert JSON files to data frames very easily, but {{tidyr::unnest_longer}} is needed for array expansion. Wonder if {{tidyr}} could be added as a recommended package and examples like this could be included in the documentation and test cases.

{code:r}
tf <- tempfile()
on.exit(unlink(tf))
writeLines('
  { "hello": 3.5, "world": false, "foo": { "bar": [ 1, 2 ] } }
  { "hello": 3.25, "world": null }
  { "hello": 0.0, "world": true, "foo": { "bar": [ 3, 4, 5 ] } }
', tf)

arrow::read_json_arrow(tf) |>
  tidyr::unnest(foo, names_sep = ".") |>
  tidyr::unnest_longer(foo.bar)
#> # A tibble: 6 × 3
#>   hello world foo.bar
#>   <dbl> <lgl>   <dbl>
#> 1  3.5  FALSE       1
#> 2  3.5  FALSE       2
#> 3  3.25 NA          NA
#> 4  0    TRUE        3
#> 5  0    TRUE        4
#> 6  0    TRUE        5
{code}
[jira] [Created] (ARROW-17142) `equals` method on Parquet Metadata segfaults when passed `None`
Kshiteej K created ARROW-17142:
-----------------------------------

Summary: `equals` method on Parquet Metadata segfaults when passed `None`
Key: ARROW-17142
URL: https://issues.apache.org/jira/browse/ARROW-17142
Project: Apache Arrow
Issue Type: Bug
Reporter: Kshiteej K

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({"a": [1, 2, 3]})

# Here metadata is None
metadata = table.schema.metadata

fname = "data.parquet"
pq.write_table(table, fname)

# Get `metadata`.
r_metadata = pq.read_metadata(fname)

# Equals on Metadata segfaults when passed None
r_metadata.equals(metadata)
{code}
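Until this is fixed, a user-side guard is straightforward (a sketch; {{metadata_equals}} is a hypothetical helper, not pyarrow API):

{code:python}
import pyarrow.parquet as pq

# hypothetical helper: treat None as "not equal" instead of passing it
# through to FileMetaData.equals(), which currently segfaults
def metadata_equals(left, right):
    if right is None:
        return False
    return left.equals(right)

r_metadata = pq.read_metadata("data.parquet")  # file written by the repro above
print(metadata_equals(r_metadata, None))  # False, instead of a segfault
{code}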
[jira] [Created] (ARROW-17141) [C++] Enable selecting nested fields in StructArray with field path
Rok Mihevc created ARROW-17141:
-----------------------------------

Summary: [C++] Enable selecting nested fields in StructArray with field path
Key: ARROW-17141
URL: https://issues.apache.org/jira/browse/ARROW-17141
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Rok Mihevc

Currently, selecting a nested field in a StructArray requires multiple selects or flattening the schema. It would be more user-friendly to accept a field path, e.g. {{field_in_top_struct.field_in_substruct}}; a sketch of the difference from the Python side is below.
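For illustration, a sketch assuming the index-path form of the {{struct_field}} compute function (the dotted-path call at the end is the proposal, not an existing API):

{code:python}
import pyarrow as pa
import pyarrow.compute as pc

# a StructArray whose first field is itself a struct
arr = pa.array([{"outer": {"inner": 1}}, {"outer": {"inner": 2}}])

# today: walk the nesting level by level with integer indices
inner = pc.struct_field(arr, indices=[0, 0])
print(inner)  # -> [1, 2]

# proposed: address the same child with a single field path, e.g.
# something like struct_field(arr, "outer.inner")  (hypothetical)
{code}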
[jira] [Created] (ARROW-17140) Adding Floor Function
Sahaj Gupta created ARROW-17140:
-----------------------------------

Summary: Adding Floor Function
Key: ARROW-17140
URL: https://issues.apache.org/jira/browse/ARROW-17140
Project: Apache Arrow
Issue Type: New Feature
Reporter: Sahaj Gupta

Adding floor function.
[jira] [Created] (ARROW-17139) [Python] Add field() method to get field from StructType
Joris Van den Bossche created ARROW-17139:
------------------------------------------

Summary: [Python] Add field() method to get field from StructType
Key: ARROW-17139
URL: https://issues.apache.org/jira/browse/ARROW-17139
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Joris Van den Bossche

From ARROW-17047: We could also add a {{field()}} method to {{StructType}} that returns a field (that is more discoverable than {{[]}}, and would be consistent with Schema and with StructArray, where it gets the child array for that field).
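For context, a quick sketch of the current and proposed access patterns (the {{field()}} call is the proposal and does not exist yet at the time of this issue):

{code:python}
import pyarrow as pa

struct_type = pa.struct([("a", pa.int32()), ("b", pa.string())])

# today: child fields are reachable by index via []
print(struct_type[0])  # the field "a"

# proposed, mirroring Schema.field() and StructArray.field():
# struct_type.field("a")   # not yet available; the subject of this issue
{code}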
[jira] [Created] (ARROW-17137) [Python] Converting data frame to Table with large nested column fails `Invalid Struct child array has length smaller than expected`
Simon Weiß created ARROW-17137:
-----------------------------------

Summary: [Python] Converting data frame to Table with large nested column fails `Invalid Struct child array has length smaller than expected`
Key: ARROW-17137
URL: https://issues.apache.org/jira/browse/ARROW-17137
Project: Apache Arrow
Issue Type: Bug
Components: Python
Reporter: Simon Weiß

Hey, I have a data frame for which one column is a nested struct array. Converting it to a `pyarrow.Table` fails if the data frame gets too big. I could reproduce the bug with a minimal example with anonymized data that is roughly similar to mine. With, e.g., `N_ROWS = 500_000` or smaller, it works fine.

```python
import pandas as pd
import pyarrow as pa

N_ROWS = 800_000

item_record = {
    "someImportantAssets": [
        {
            "square": "https://some.super.loong.link.com/withmany/lorem/upload/ipsum/stilllooonger/lorem/{someparameter}/156fdjjf644984dfdfaera648/specificLink-i15348891"
        }
    ],
    "id": "i15348891",
    "title": "Some Long Item Title i15348891",
}

user_record = {
    "userId": "faa4648-4964drf-64648fafa648-4648falj",
    "data": [item_record for _ in range(24)],
}

df = pd.DataFrame([user_record for _ in range(N_ROWS)])
table = pa.Table.from_pandas(df)
```

```python-traceback
Traceback (most recent call last):
  table = pa.Table.from_pandas(df)
  File "pyarrow/table.pxi", line 1658, in pyarrow.lib.Table.from_pandas
  File "pyarrow/table.pxi", line 1702, in pyarrow.lib.Table.from_arrays
  File "pyarrow/table.pxi", line 1314, in pyarrow.lib.Table.validate
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Column 1: In chunk 0: Invalid: List child array invalid: Invalid: Struct child array #1 invalid: Invalid: List child array invalid: Invalid: Struct child array #0 has length smaller than expected for struct array (13256071 < 13256072)
```

The length is always smaller than expected by 1.

h2. Expected behavior:

Run without errors or fail with a better error message.

h2. System Info and Versions:

Apple M1 Pro, but it also happened on an amd64 Linux machine on AWS.

```
arrow-cpp  7.0.0  py39h8a997f0_8_cpu  conda-forge
pyarrow    7.0.0  py39h3a11367_8_cpu  conda-forge
python     3.9.7  h54d631c_3_cpython  conda-forge
```

I could also reproduce with `pyarrow 8.0.0`.
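A possible user-side workaround while this is open, sketched under the assumption that the failure only appears once a single conversion gets very large (`table_from_pandas_chunked` and the slice size are hypothetical, not pyarrow API):

```python
import pandas as pd
import pyarrow as pa

# hypothetical helper: convert the frame in slices and stitch the pieces
# back together, so no single from_pandas call sees all 800k rows
def table_from_pandas_chunked(df: pd.DataFrame, rows_per_chunk: int = 100_000) -> pa.Table:
    tables = [
        pa.Table.from_pandas(df.iloc[start:start + rows_per_chunk])
        for start in range(0, len(df), rows_per_chunk)
    ]
    return pa.concat_tables(tables)
```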
[jira] [Created] (ARROW-17136) open_append_stream throws an error if the file does not exist
Sagar Shinde created ARROW-17136:
-----------------------------------

Summary: open_append_stream throws an error if the file does not exist
Key: ARROW-17136
URL: https://issues.apache.org/jira/browse/ARROW-17136
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 8.0.0
Reporter: Sagar Shinde

According to the documentation, open_append_stream will create the file if it does not exist. But when I try to append to a file in HDFS, it throws a file-not-found error:

{code}
hdfsOpenFile(/tmp/xyz.json): FileSystem#append((Lorg/apache/hadoop/fs/Path;)Lorg/apache/hadoop/fs/FSDataOutputStream;) error:
RemoteException: Failed to append to non-existent file /tmp/xyz.json for client 10.128.8.11
    at org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:104)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

java.io.FileNotFoundException: Failed to append to non-existent file /tmp/xyz.json for client 10.128.8.11
    at org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:104)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
    at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1367)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1424)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1394)
    at org.apache.hadoop.hdfs.DistributedFileSystem$5.doCall(DistributedFileSystem.java:423)
    at org.apache.hadoop.hdfs.DistributedFileSystem$5.doCall(DistributedFileSystem.java:419)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:431)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:400)
    at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1386)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Failed to append to non-existent file /tmp/xyz.json for client 10.128.8.11
    at org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:104) at
{code}
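A user-side workaround sketch until the reported behavior is resolved (assumes {{pyarrow.fs.HadoopFileSystem}} with default connection settings; the path is from the report, the payload line is a placeholder):

{code:python}
from pyarrow import fs

# connect using fs.defaultFS from the Hadoop configuration
hdfs = fs.HadoopFileSystem("default")
path = "/tmp/xyz.json"

# create the file when it is missing, append otherwise
if hdfs.get_file_info(path).type == fs.FileType.NotFound:
    stream = hdfs.open_output_stream(path)
else:
    stream = hdfs.open_append_stream(path)

with stream:
    stream.write(b'{"key": "value"}\n')  # placeholder payload
{code}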