[
https://issues.apache.org/jira/browse/ARROW-13546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17393032#comment-17393032
]
Maarten Breddels commented on ARROW-13546:
--
My current workaround, to make it backward
Maarten Breddels created ARROW-13546:
Summary: [Python] Breaking API change in FSSpecHandler, requires
metadata argument
Key: ARROW-13546
URL: https://issues.apache.org/jira/browse/ARROW-13546
[
https://issues.apache.org/jira/browse/ARROW-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17374890#comment-17374890
]
Maarten Breddels commented on ARROW-13259:
--
Does my comment
[
https://issues.apache.org/jira/browse/ARROW-12608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337317#comment-17337317
]
Maarten Breddels commented on ARROW-12608:
--
I agree a split_pattern_regex might make sense, you
[
https://issues.apache.org/jira/browse/ARROW-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332653#comment-17332653
]
Maarten Breddels commented on ARROW-12547:
--
I recommend trying without memory mapping. If IO
[
https://issues.apache.org/jira/browse/ARROW-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286400#comment-17286400
]
Maarten Breddels commented on ARROW-3016:
-
I'd also recommend using perf with uprobes for this,
[
https://issues.apache.org/jira/browse/ARROW-11000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17253047#comment-17253047
]
Maarten Breddels commented on ARROW-11000:
--
did you check with passing
Maarten Breddels created ARROW-10959:
Summary: [C++] Add scalar string join kernel
Key: ARROW-10959
URL: https://issues.apache.org/jira/browse/ARROW-10959
Project: Apache Arrow
Issue
[
https://issues.apache.org/jira/browse/ARROW-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251793#comment-17251793
]
Maarten Breddels commented on ARROW-10557:
--
This would be easier to implement using the tools
[
https://issues.apache.org/jira/browse/ARROW-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243437#comment-17243437
]
Maarten Breddels commented on ARROW-10799:
--
Would you mind opening a draft PR for that, in case
[
https://issues.apache.org/jira/browse/ARROW-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243422#comment-17243422
]
Maarten Breddels commented on ARROW-10799:
--
Ah yes, that implementation makes sense. I saw the
[
https://issues.apache.org/jira/browse/ARROW-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243417#comment-17243417
]
Maarten Breddels commented on ARROW-10799:
--
{code:java}
import pyarrow as pa
a = pa.array(['a']
Maarten Breddels created ARROW-10799:
Summary: [C++] Take on string chunked arrays slow and fails
Key: ARROW-10799
URL: https://issues.apache.org/jira/browse/ARROW-10799
Project: Apache Arrow
[
https://issues.apache.org/jira/browse/ARROW-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239312#comment-17239312
]
Maarten Breddels commented on ARROW-10739:
--
Ok, good to know.
Two workarounds I came up with
[
https://issues.apache.org/jira/browse/ARROW-10736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239305#comment-17239305
]
Maarten Breddels commented on ARROW-10736:
--
Thanks, I tried scan with an empty schema on the
Maarten Breddels created ARROW-10739:
Summary: [Python] Pickling a sliced array serializes all the
buffers
Key: ARROW-10739
URL: https://issues.apache.org/jira/browse/ARROW-10739
Project: Apache
Maarten Breddels created ARROW-10736:
Summary: [Python] feather/arrow row splitting and counting
(Dataset API)
Key: ARROW-10736
URL: https://issues.apache.org/jira/browse/ARROW-10736
Project:
[
https://issues.apache.org/jira/browse/ARROW-10709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238106#comment-17238106
]
Maarten Breddels commented on ARROW-10709:
--
Pandas also does not like it when .read returns a
Maarten Breddels created ARROW-10709:
Summary: [Python] Difficult to make an efficient zero-copy file
reader in Python
Key: ARROW-10709
URL: https://issues.apache.org/jira/browse/ARROW-10709
[
https://issues.apache.org/jira/browse/ARROW-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17235563#comment-17235563
]
Maarten Breddels commented on ARROW-10640:
--
Yes, that would maybe be the 'ultimate' variant,
[
https://issues.apache.org/jira/browse/ARROW-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17234481#comment-17234481
]
Maarten Breddels commented on ARROW-10640:
--
Another idea would be to have a 'choose' like
[
https://issues.apache.org/jira/browse/ARROW-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229989#comment-17229989
]
Maarten Breddels commented on ARROW-9489:
-
Yes, I thought about that too. Although I think a
Maarten Breddels created ARROW-10557:
Summary: [C++] Add scalar string slicing/substring kernel
Key: ARROW-10557
URL: https://issues.apache.org/jira/browse/ARROW-10557
Project: Apache Arrow
Maarten Breddels created ARROW-10556:
Summary: [C++] Caching pre computed data based on FunctionOptions
in the kernel state
Key: ARROW-10556
URL: https://issues.apache.org/jira/browse/ARROW-10556
Maarten Breddels created ARROW-10541:
Summary: [C++] Add re2 library to core arrow / ARROW_WITH_RE2
Key: ARROW-10541
URL: https://issues.apache.org/jira/browse/ARROW-10541
Project: Apache Arrow
[
https://issues.apache.org/jira/browse/ARROW-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226572#comment-17226572
]
Maarten Breddels commented on ARROW-9489:
-
Yes, happy to take this on, since it's an ugly code
[
https://issues.apache.org/jira/browse/ARROW-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maarten Breddels reassigned ARROW-9128:
---
Assignee: Maarten Breddels
> [C++] Implement string space trimming kernels: trim,
Maarten Breddels created ARROW-10306:
Summary: [C++] Add string replacement kernel
Key: ARROW-10306
URL: https://issues.apache.org/jira/browse/ARROW-10306
Project: Apache Arrow
Issue
[
https://issues.apache.org/jira/browse/ARROW-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17213857#comment-17213857
]
Maarten Breddels commented on ARROW-9128:
-
Shall I implement this?
> [C++] Implement string
Maarten Breddels created ARROW-10209:
Summary: [Python] support positional arguments for options in
compute wrapper
Key: ARROW-10209
URL: https://issues.apache.org/jira/browse/ARROW-10209
Maarten Breddels created ARROW-10208:
Summary: [C++] comparing list arrays with nulls fails in test
framework
Key: ARROW-10208
URL: https://issues.apache.org/jira/browse/ARROW-10208
Project:
Maarten Breddels created ARROW-10207:
Summary: [C++] Unary kernels that results in a list have no
preallocated offset buffer
Key: ARROW-10207
URL: https://issues.apache.org/jira/browse/ARROW-10207
Maarten Breddels created ARROW-10195:
Summary: [C++] Add string struct extract kernel using re2
Key: ARROW-10195
URL: https://issues.apache.org/jira/browse/ARROW-10195
Project: Apache Arrow
[
https://issues.apache.org/jira/browse/ARROW-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197892#comment-17197892
]
Maarten Breddels commented on ARROW-10023:
--
It's gonna be in C++, I can push an initial version
[
https://issues.apache.org/jira/browse/ARROW-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196786#comment-17196786
]
Maarten Breddels commented on ARROW-9991:
-
Indeed, and whatever Unicode specifies as 'whitespace'
[
https://issues.apache.org/jira/browse/ARROW-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196785#comment-17196785
]
Maarten Breddels commented on ARROW-10023:
--
Probably related to
[
https://issues.apache.org/jira/browse/ARROW-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maarten Breddels updated ARROW-9991:
Summary: [C++] split kernels for strings/binary (was: [C++] split kernsl
for
[
https://issues.apache.org/jira/browse/ARROW-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maarten Breddels updated ARROW-9991:
Description:
Similar to Python str.split and bytes.split, we'd like to have a way to
Maarten Breddels created ARROW-9991:
---
Summary: [C++] split kernsl for strings/binary
Key: ARROW-9991
URL: https://issues.apache.org/jira/browse/ARROW-9991
Project: Apache Arrow
Issue Type:
Maarten Breddels created ARROW-9471:
---
Summary: [C++] Scan Dataset in reverse
Key: ARROW-9471
URL: https://issues.apache.org/jira/browse/ARROW-9471
Project: Apache Arrow
Issue Type:
[
https://issues.apache.org/jira/browse/ARROW-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157387#comment-17157387
]
Maarten Breddels commented on ARROW-9458:
-
let me know if you want to do the honors yourself,
[
https://issues.apache.org/jira/browse/ARROW-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157374#comment-17157374
]
Maarten Breddels commented on ARROW-9458:
-
Indeed, seeing a massive speedup. Too bad py-spy
[
https://issues.apache.org/jira/browse/ARROW-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157340#comment-17157340
]
Maarten Breddels commented on ARROW-9458:
-
Did you set batch_size=1_000_000?
> [Python] Dataset
[
https://issues.apache.org/jira/browse/ARROW-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157338#comment-17157338
]
Maarten Breddels commented on ARROW-9458:
-
Running this (now with all columns)
{code:java}
[
https://issues.apache.org/jira/browse/ARROW-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maarten Breddels updated ARROW-9458:
Attachment: image-2020-07-14-14-38-16-767.png
> [Python] Dataset singlethreaded only
>
[
https://issues.apache.org/jira/browse/ARROW-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maarten Breddels updated ARROW-9458:
Attachment: image-2020-07-14-14-31-29-943.png
> [Python] Dataset singlethreaded only
>
[
https://issues.apache.org/jira/browse/ARROW-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maarten Breddels closed ARROW-9456.
---
Resolution: Not A Bug
> [Python] Dataset segfault when not importing pyarrow.parquet
>
[
https://issues.apache.org/jira/browse/ARROW-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157293#comment-17157293
]
Maarten Breddels commented on ARROW-9456:
-
Note that you should not run the vaex parquet example
[
https://issues.apache.org/jira/browse/ARROW-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157235#comment-17157235
]
Maarten Breddels commented on ARROW-9444:
-
Feel free to assign to me, I didn't know there was a
[
https://issues.apache.org/jira/browse/ARROW-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157231#comment-17157231
]
Maarten Breddels commented on ARROW-9456:
-
This file gives me the same problem
{code:java}
import
Maarten Breddels created ARROW-9458:
---
Summary: [Python] Dataset singlethreaded only
Key: ARROW-9458
URL: https://issues.apache.org/jira/browse/ARROW-9458
Project: Apache Arrow
Issue Type:
Maarten Breddels created ARROW-9456:
---
Summary: [Python] Dataset segfault when not importing
pyarrow.parquet
Key: ARROW-9456
URL: https://issues.apache.org/jira/browse/ARROW-9456
Project: Apache
Maarten Breddels created ARROW-9403:
---
Summary: [Python] add .tolist as alias of to_pylist
Key: ARROW-9403
URL: https://issues.apache.org/jira/browse/ARROW-9403
Project: Apache Arrow
Issue
[
https://issues.apache.org/jira/browse/ARROW-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maarten Breddels updated ARROW-9403:
Summary: [Python] add .tolist as alias of .to_pylist (was: [Python] add
.tolist as alias
Maarten Breddels created ARROW-9268:
---
Summary: [C++] Add is{alnum,alpha,...} kernels for strings
Key: ARROW-9268
URL: https://issues.apache.org/jira/browse/ARROW-9268
Project: Apache Arrow
Maarten Breddels created ARROW-9133:
---
Summary: [C++] Add utf8_upper and utf8_lower
Key: ARROW-9133
URL: https://issues.apache.org/jira/browse/ARROW-9133
Project: Apache Arrow
Issue Type:
Maarten Breddels created ARROW-9131:
---
Summary: [C++] Faster ascii_lower and ascii_upper
Key: ARROW-9131
URL: https://issues.apache.org/jira/browse/ARROW-9131
Project: Apache Arrow
Issue
57 matches