rdettai commented on a change in pull request #6935:
URL: https://github.com/apache/arrow/pull/6935#discussion_r414417688
##
File path: rust/parquet/src/column/reader.rs
##
@@ -190,15 +190,12 @@ impl ColumnReaderImpl {
(self.num_buffered_values -
zhztheplayer opened a new pull request #7030:
URL: https://github.com/apache/arrow/pull/7030
Add the following Datasets APIs to Java:
- DatasetFactory
- Dataset
- Scanner
- ScanTask
Add a native dataset path to bridge C++ Datasets components to Java:
-
jianxind commented on pull request #7029:
URL: https://github.com/apache/arrow/pull/7029#issuecomment-618855696
cc @emkornfield
The AVX512 path is straightforward thanks to the
mask_compress/mask_expand APIs provided by AVX512. For potential path-finding of
SSE/AVX2, as you
kiszk commented on a change in pull request #7029:
URL: https://github.com/apache/arrow/pull/7029#discussion_r414434809
##
File path: cpp/src/arrow/util/spaced.h
##
@@ -0,0 +1,266 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license
kiszk commented on a change in pull request #7029:
URL: https://github.com/apache/arrow/pull/7029#discussion_r414434434
##
File path: cpp/src/arrow/util/spaced.h
##
@@ -0,0 +1,266 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license
nevi-me commented on pull request #7018:
URL: https://github.com/apache/arrow/pull/7018#issuecomment-618942929
> @nevi-me This is looking good, but the generated source file needs the ASF
header. CI is failing with ` apache-rat license violation:
github-actions[bot] commented on pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#issuecomment-618919224
https://issues.apache.org/jira/browse/ARROW-7808
This is an automated message from the Apache Git
jianxind opened a new pull request #7029:
URL: https://github.com/apache/arrow/pull/7029
1. Create the spaced encoding/decoding benchmark items.
2. Create unit tests for the spaced API SIMD implementation.
3. Move spaced scalar/SIMD to a new header file.
4. Add the path of AVX512 epi32 and
kszucs edited a comment on pull request #7028:
URL: https://github.com/apache/arrow/pull/7028#issuecomment-618944665
There is another security constraint about this approach: anyone can trigger
a rebase on the PR, not just the participants / committers. To resolve that you
need to check
kszucs commented on pull request #7028:
URL: https://github.com/apache/arrow/pull/7028#issuecomment-618944665
There is another security constraint about this approach: anyone can trigger
a rebase on the PR, not just the participants. To resolve that you need to check
`author_association`
github-actions[bot] commented on pull request #7029:
URL: https://github.com/apache/arrow/pull/7029#issuecomment-618853399
https://issues.apache.org/jira/browse/ARROW-8579
pitrou commented on pull request #7029:
URL: https://github.com/apache/arrow/pull/7029#issuecomment-618917554
I'd gladly see an AVX2 or SSE version indeed, as many CPUs don't have AVX512.
liyafan82 commented on a change in pull request #6323:
URL: https://github.com/apache/arrow/pull/6323#discussion_r414332379
##
File path:
java/memory/src/test/java/org/apache/arrow/memory/TestLargeArrowBuf.java
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software
liyafan82 commented on a change in pull request #6323:
URL: https://github.com/apache/arrow/pull/6323#discussion_r414330656
##
File path:
java/memory/src/main/java/org/apache/arrow/memory/NettyAllocationManager.java
##
@@ -34,31 +33,34 @@
static final
liyafan82 commented on a change in pull request #6323:
URL: https://github.com/apache/arrow/pull/6323#discussion_r414330471
##
File path:
java/memory/src/main/java/org/apache/arrow/memory/NettyAllocationManager.java
##
@@ -34,31 +33,34 @@
static final
liyafan82 commented on a change in pull request #6323:
URL: https://github.com/apache/arrow/pull/6323#discussion_r414330317
##
File path:
java/memory/src/main/java/org/apache/arrow/memory/NettyAllocationManager.java
##
@@ -34,31 +33,34 @@
static final
jianxind commented on a change in pull request #7029:
URL: https://github.com/apache/arrow/pull/7029#discussion_r414505645
##
File path: cpp/src/arrow/util/spaced.h
##
@@ -0,0 +1,266 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor
fsaintjacques commented on pull request #7022:
URL: https://github.com/apache/arrow/pull/7022#issuecomment-618985822
Could you accompany this with a script/utility to compute both metrics? Paired with
toxiproxy, we could replicate S3 regions behavior with localhost.
rdettai commented on pull request #6949:
URL: https://github.com/apache/arrow/pull/6949#issuecomment-618997029
> Originally we designed it this way so that we can concurrently read
multiple column chunks after obtaining file handle from a single row group.
Since the file handle is shared
rdettai edited a comment on pull request #6949:
URL: https://github.com/apache/arrow/pull/6949#issuecomment-618997029
> Originally we designed it this way so that we can concurrently read
multiple column chunks after obtaining file handle from a single row group.
Since the file handle is
sunchao commented on a change in pull request #6935:
URL: https://github.com/apache/arrow/pull/6935#discussion_r414738189
##
File path: rust/parquet/src/column/reader.rs
##
@@ -190,15 +190,12 @@ impl ColumnReaderImpl {
(self.num_buffered_values -
BryanCutler commented on a change in pull request #6323:
URL: https://github.com/apache/arrow/pull/6323#discussion_r414695756
##
File path:
java/memory/src/main/java/org/apache/arrow/memory/NettyAllocationManager.java
##
@@ -17,48 +17,97 @@
package org.apache.arrow.memory;
nealrichardson commented on pull request #7028:
URL: https://github.com/apache/arrow/pull/7028#issuecomment-619074972
I'm not worried about security risks in this particular case. If some
random person wants to rebase my PR on apache/arrow@master, great! Now I don't
have to! While I
nevi-me commented on a change in pull request #6306:
URL: https://github.com/apache/arrow/pull/6306#discussion_r414662522
##
File path: rust/arrow/src/compute/kernels/sort.rs
##
@@ -0,0 +1,671 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
nevi-me commented on a change in pull request #6306:
URL: https://github.com/apache/arrow/pull/6306#discussion_r414663303
##
File path: rust/arrow/src/compute/kernels/sort.rs
##
@@ -0,0 +1,671 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
nealrichardson commented on a change in pull request #7026:
URL: https://github.com/apache/arrow/pull/7026#discussion_r414680234
##
File path: r/src/expression.cpp
##
@@ -21,99 +21,97 @@
// [[arrow::export]]
std::shared_ptr dataset___expr__field_ref(std::string name) {
-
kiszk commented on a change in pull request #7029:
URL: https://github.com/apache/arrow/pull/7029#discussion_r414739332
##
File path: cpp/src/arrow/util/spaced.h
##
@@ -0,0 +1,266 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license
sunchao commented on pull request #6949:
URL: https://github.com/apache/arrow/pull/6949#issuecomment-619140531
> It's the reader (file handle) that is passed to it that should be thread
safe
Is [file](https://doc.rust-lang.org/std/fs/struct.File.html) thread-safe?
It's not obvious
pitrou opened a new pull request #7031:
URL: https://github.com/apache/arrow/pull/7031
fsaintjacques commented on pull request #7022:
URL: https://github.com/apache/arrow/pull/7022#issuecomment-619186142
I'd say just plain HTTP; as @lidavidm pointed out in his comment, this is a
network attribute.
andygrove commented on pull request #7018:
URL: https://github.com/apache/arrow/pull/7018#issuecomment-619202633
Let's see what others say on this. Personally, I think it would be better
for build.rs to automatically prepend the ASF license header because there is
the risk of someone
github-actions[bot] commented on pull request #7033:
URL: https://github.com/apache/arrow/pull/7033#issuecomment-619195315
https://issues.apache.org/jira/browse/ARROW-7759
Zhen-hao opened a new issue #7034:
URL: https://github.com/apache/arrow/issues/7034
Hi there,
this is more of a question than a bug report.
I am using NixOS 20.03 and couldn't get the arrow library in R to see the
arrow C++ library.
Even when I install the library from R
nealrichardson commented on pull request #7033:
URL: https://github.com/apache/arrow/pull/7033#issuecomment-619213038
@github-actions rebase
nealrichardson commented on pull request #6879:
URL: https://github.com/apache/arrow/pull/6879#issuecomment-619212781
@github-actions rebase
markhildreth opened a new pull request #7035:
URL: https://github.com/apache/arrow/pull/7035
Fixes [ARROW-8590](https://issues.apache.org/jira/browse/ARROW-8590)
This builds on #6972, and thus should be merged after that PR is merged.
markhildreth commented on pull request #6972:
URL: https://github.com/apache/arrow/pull/6972#issuecomment-619215502
Created [follow-up JIRA
task](https://issues.apache.org/jira/browse/ARROW-8590).
vertexclique opened a new pull request #7036:
URL: https://github.com/apache/arrow/pull/7036
This PR enables reverse lookup for an already built dictionary.
zgramana opened a new pull request #7032:
URL: https://github.com/apache/arrow/pull/7032
Takes an alternative approach to completing
[ARROW-6603](https://issues.apache.org/jira/browse/ARROW-6603) that is in line
with the current API and with other Arrow implementations. More
mayuropensource commented on pull request #7022:
URL: https://github.com/apache/arrow/pull/7022#issuecomment-619184138
@fsaintjacques, I can try to put together a Python script using boto to
determine the S3 metrics. Will that work for you?
nealrichardson commented on issue #7034:
URL: https://github.com/apache/arrow/issues/7034#issuecomment-619210756
We don't do any testing on NixOS, so it's not surprising that it doesn't
just work.
http://arrow.apache.org/docs/r/articles/install.html describes how
dependencies are
github-actions[bot] commented on pull request #7031:
URL: https://github.com/apache/arrow/pull/7031#issuecomment-619156566
https://issues.apache.org/jira/browse/ARROW-8587
github-actions[bot] commented on pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#issuecomment-619164076
https://issues.apache.org/jira/browse/ARROW-6603
nevi-me commented on pull request #7024:
URL: https://github.com/apache/arrow/pull/7024#issuecomment-619189091
@paddyhoran we might have to try a different nightly, as sometimes a day's
version might have no rustfmt. The change I made in that PR installs a nightly
version, I don't know
bkietz opened a new pull request #7033:
URL: https://github.com/apache/arrow/pull/7033
github-actions[bot] commented on pull request #7035:
URL: https://github.com/apache/arrow/pull/7035#issuecomment-619219686
https://issues.apache.org/jira/browse/ARROW-8590
github-actions[bot] commented on pull request #7036:
URL: https://github.com/apache/arrow/pull/7036#issuecomment-619219685
https://issues.apache.org/jira/browse/ARROW-8591
zgramana commented on pull request #6121:
URL: https://github.com/apache/arrow/pull/6121#issuecomment-619162558
@eerhardt I've just submitted https://github.com/apache/arrow/pull/7032 for
review/discussion
bkietz commented on a change in pull request #7033:
URL: https://github.com/apache/arrow/pull/7033#discussion_r414856596
##
File path: cpp/src/arrow/dataset/file_csv.h
##
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor
markhildreth edited a comment on pull request #7024:
URL: https://github.com/apache/arrow/pull/7024#issuecomment-619273713
@andygrove I think there is going to be more to this than this PR. The
"nightly-2019-11-14" string [can be found in a few
velvia commented on a change in pull request #4815:
URL: https://github.com/apache/arrow/pull/4815#discussion_r414877852
##
File path: format/Message.fbs
##
@@ -21,10 +21,69 @@ include "Tensor.fbs";
namespace org.apache.arrow.flatbuf;
+///
markhildreth commented on pull request #7024:
URL: https://github.com/apache/arrow/pull/7024#issuecomment-619273713
@andygrove I think there is going to be more to this than this PR. The
"nightly-2019-11-14" string [can be found in a few
nevi-me commented on a change in pull request #7036:
URL: https://github.com/apache/arrow/pull/7036#discussion_r414875192
##
File path: rust/arrow/src/array/array.rs
##
@@ -1786,38 +1786,34 @@ impl From<(Vec<(Field, ArrayRef)>, Buffer, usize)> for
StructArray {
/// This is
nealrichardson commented on a change in pull request #7033:
URL: https://github.com/apache/arrow/pull/7033#discussion_r414852277
##
File path: cpp/src/arrow/dataset/file_csv.h
##
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
nevi-me opened a new pull request #7037:
URL: https://github.com/apache/arrow/pull/7037
This removes the dependency on packed_simd. I initially thought that boolean
kernels were slower than with explicit SIMD, but this was a false alarm as the
benchmarks weren't comparing SIMD vs
github-actions[bot] commented on pull request #7037:
URL: https://github.com/apache/arrow/pull/7037#issuecomment-619242490
https://issues.apache.org/jira/browse/ARROW-6718
mayuropensource edited a comment on pull request #7022:
URL: https://github.com/apache/arrow/pull/7022#issuecomment-619276182
// SOME_S3_DATA_URI should point to a file (over http) that is ~500 MiB.
// TTFB_sec is the time-to-first-byte in seconds as measured by curl
//
mayuropensource commented on pull request #7022:
URL: https://github.com/apache/arrow/pull/7022#issuecomment-619276182
// SOME_S3_DATA_URI should point to a file (over http) that is ~500 MiB.
// TTFB_sec is the time-to-first-byte in seconds as measured by curl
//
emkornfield commented on a change in pull request #6912:
URL: https://github.com/apache/arrow/pull/6912#discussion_r414972130
##
File path: java/vector/src/main/java/org/apache/arrow/vector/ValueVector.java
##
@@ -283,4 +283,10 @@
* @return the name of the vector.
*/
emkornfield commented on pull request #7025:
URL: https://github.com/apache/arrow/pull/7025#issuecomment-619317374
@chrish42 Thank you for the PR, I'll take a look now. Note it looks like
lint is failing due to formatting issues. You need to run "make format" or
"ninja format" to run
wjones1 commented on a change in pull request #6979:
URL: https://github.com/apache/arrow/pull/6979#discussion_r414951095
##
File path: python/pyarrow/_parquet.pyx
##
@@ -1083,6 +1084,50 @@ cdef class ParquetReader:
def set_use_threads(self, bint use_threads):
emkornfield commented on a change in pull request #7025:
URL: https://github.com/apache/arrow/pull/7025#discussion_r414974208
##
File path: cpp/src/plasma/store.cc
##
@@ -1207,65 +1211,77 @@ void StartServer(char* socket_name, std::string
plasma_directory, bool hugepages
zgramana commented on pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#issuecomment-619296269
@eerhardt apologies for loading up three issues in the title, but I kept
finding older issues in the Apache Jira backlog that were addressed here as
well, and so erred on the
wjones1 commented on a change in pull request #6979:
URL: https://github.com/apache/arrow/pull/6979#discussion_r414948745
##
File path: cpp/src/parquet/arrow/reader.cc
##
@@ -260,12 +260,28 @@ class FileReaderImpl : public FileReader {
Status GetRecordBatchReader(const
wjones1 commented on a change in pull request #6979:
URL: https://github.com/apache/arrow/pull/6979#discussion_r414948789
##
File path: python/pyarrow/_parquet.pxd
##
@@ -334,7 +334,7 @@ cdef extern from "parquet/api/reader.h" namespace "parquet"
nogil:
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414987154
##
File path:
java/dataset/src/main/java/org/apache/arrow/dataset/jni/JniWrapper.java
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414987685
##
File path:
java/dataset/src/main/java/org/apache/arrow/dataset/scanner/ScanTask.java
##
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414987799
##
File path:
java/dataset/src/main/java/org/apache/arrow/dataset/scanner/ScanTask.java
##
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414988755
##
File path: java/dataset/src/test/java/org/apache/arrow/util/SchemaUtilsTest.java
##
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414988831
##
File path: java/pom.xml
##
@@ -369,24 +369,24 @@
org.apache.maven.plugins
maven-compiler-plugin
3.6.2
-
-
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414988474
##
File path: java/dataset/src/main/java/org/apache/arrow/util/SchemaUtils.java
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414989937
##
File path: cpp/src/jni/dataset/jni_wrapper.cpp
##
@@ -0,0 +1,577 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414990082
##
File path: cpp/src/jni/dataset/jni_wrapper.cpp
##
@@ -0,0 +1,577 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
emkornfield commented on a change in pull request #7025:
URL: https://github.com/apache/arrow/pull/7025#discussion_r414975521
##
File path: cpp/src/plasma/store.cc
##
@@ -1207,65 +1211,77 @@ void StartServer(char* socket_name, std::string
plasma_directory, bool hugepages
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414983499
##
File path: cpp/src/jni/dataset/concurrent_map.h
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414984380
##
File path: cpp/src/jni/dataset/jni_wrapper.cpp
##
@@ -0,0 +1,577 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414984900
##
File path: cpp/src/jni/dataset/jni_wrapper.cpp
##
@@ -0,0 +1,577 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414985230
##
File path: cpp/src/jni/dataset/jni_wrapper.cpp
##
@@ -0,0 +1,577 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414986335
##
File path: cpp/src/jni/dataset/proto/Types.proto
##
@@ -0,0 +1,149 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414986841
##
File path:
java/dataset/src/main/java/org/apache/arrow/dataset/file/JniWrapper.java
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414987860
##
File path:
java/dataset/src/main/java/org/apache/arrow/dataset/source/DatasetFactory.java
##
@@ -0,0 +1,34 @@
+/*
+ * Licensed to the Apache
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414988086
##
File path:
java/dataset/src/main/java/org/apache/arrow/memory/NativeUnderlingMemory.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414988443
##
File path: java/dataset/src/main/java/org/apache/arrow/util/SchemaUtils.java
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation
emkornfield commented on a change in pull request #7030:
URL: https://github.com/apache/arrow/pull/7030#discussion_r414989258
##
File path:
java/dataset/src/test/java/org/apache/arrow/dataset/jni/NativeDatasetTest.java
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache