jsoref commented on code in PR #38896:
URL: https://github.com/apache/arrow/pull/38896#discussion_r1406489900
##########
cpp/src/gandiva/expr_validator.cc:
##########
@@ -79,7 +79,7 @@ Status ExprValidator::Visit(const FieldNode& node) {
Status::ExpressionValidationError("Field ",
node.field()->name(),
" not in schema."));
- // Ensure that that the found field match.
+ // Ensure that the found field matches.
Review Comment:
atypical correction
##########
dev/archery/archery/tests/fixtures/event-issue-comment-build-command.json:
##########
@@ -57,7 +57,7 @@
},
"repository_url": "https://api.github.com/repos/ursa-labs/ursabot",
"state": "open",
- "title": "Unittests for GithubHook",
+ "title": "Unittests for GitHubHook",
Review Comment:
I'm not sure if anything cares about this text.
The contents of https://api.github.com/repositories/169101701/issues/26
don't match the content here...
##########
python/pyarrow/tests/test_fs.py:
##########
@@ -1303,12 +1303,12 @@ def test_s3_proxy_options(monkeypatch, pickle_module):
# Missing port
with pytest.raises(KeyError):
S3FileSystem(proxy_options={'scheme': 'http', 'host': 'localhost'})
- # Invalid proxy URI (invalid scheme htttps)
+ # Invalid proxy URI (invalid scheme httpsB)
with pytest.raises(pa.ArrowInvalid):
- S3FileSystem(proxy_options='htttps://localhost:9000')
- # Invalid proxy_options dict (invalid scheme htttps)
+ S3FileSystem(proxy_options='httpsB://localhost:9000')
+ # Invalid proxy_options dict (invalid scheme httpA)
Review Comment:
The comment here was incorrect...
##########
r/vignettes/data_types.Rmd:
##########
@@ -34,7 +34,7 @@ When the arrow package converts between R data and Arrow
data, it will first che
knitr::include_graphics("./data_types.png")
```
-In this image, black boxes refer to R data types and light blue boxes refer to
Arrow data types. Directional arrows specify conversions (e.g., the
bidirectional arrow between the logical R type and the boolean Arrow type means
that R logicals convert to Arrow booleans and vice versa). Solid lines indicate
that the this conversion rule is always the default; dashed lines mean that it
only sometimes applies (the rules and special cases are described below).
+In this image, black boxes refer to R data types and light blue boxes refer to
Arrow data types. Directional arrows specify conversions (e.g., the
bidirectional arrow between the logical R type and the boolean Arrow type means
that the logical R converts to an Arrow boolean and vice versa). Solid lines
indicate that this conversion rule is always the default; dashed lines mean
that it only sometimes applies (the rules and special cases are described
below).
Review Comment:
Notable changes
##########
cpp/src/arrow/compute/kernels/vector_selection_filter_internal.cc:
##########
@@ -171,7 +171,7 @@ class PrimitiveFilterImpl {
}
if (out_arr->buffers[0] != nullptr) {
- // May not be allocated if neither filter nor values contains nulls
+ // May be unallocated if neither filter nor values contain nulls
Review Comment:
slightly different change
##########
cpp/src/arrow/filesystem/test_util.h:
##########
@@ -170,7 +170,7 @@ class ARROW_TESTING_EXPORT GenericFileSystemTest {
virtual bool allow_move_dir_over_non_empty_dir() const { return false; }
// - Whether the filesystem allows appending to a file
virtual bool allow_append_to_file() const { return true; }
- // - Whether the filesystem allows appending to a new (not existent yet) file
+ // - Whether the filesystem allows appending to a new (not yet extant) file
Review Comment:
atypical correction
##########
c_glib/parquet-glib/arrow-file-reader.cpp:
##########
@@ -123,7 +123,7 @@
gparquet_arrow_file_reader_class_init(GParquetArrowFileReaderClass *klass)
/**
* gparquet_arrow_file_reader_new_arrow:
* @source: Arrow source to be read.
- * @error: (nullable): Return locatipcn for a #GError or %NULL.
+ * @error: (nullable): Return location for a #GError or %NULL.
Review Comment:
I had a lot of trouble with this as I'm not sure what this means if this is
the correction
##########
cpp/src/arrow/util/ree_util_test.cc:
##########
@@ -101,7 +101,7 @@ TYPED_TEST_P(ReeUtilTest, PhysicalLength) {
ASSERT_EQ(internal::FindPhysicalLength(run_ends246, 4, 0, 7), 0);
}
-TYPED_TEST_P(ReeUtilTest, MergedRunsInterator) {
+TYPED_TEST_P(ReeUtilTest, MergedRunsIteratorTest) {
Review Comment:
The obvious correction collided with a core class...
##########
cpp/src/parquet/level_conversion_test.cc:
##########
@@ -127,7 +127,7 @@ TEST(DefLevelsToBitmap,
WithRepetitionLevelFiltersOutEmptyListValues) {
level_info.repeated_ancestor_def_level = 1;
level_info.def_level = 2;
level_info.rep_level = 1;
- // All zeros should be ignored, ones should be unset in the bitmp and 2
should be set.
+ // All zeros should be ignored, ones should be unset in the bitmap and 2
should be set.
Review Comment:
sometimes there are specialized classes, but I think this is just a typo
##########
dev/archery/archery/crossbow/tests/fixtures/crossbow-job-no-failure.yaml:
##########
@@ -1,7 +1,7 @@
!Job
target: !Target
head: f766a1d615dd1b7ee706d05102e579195951a61c
- email: unkown
+ email: unknown
Review Comment:
I'm not sure if archery/crossbow will care about this
##########
csharp/src/Apache.Arrow/Flatbuf/FlatBuffers/ByteBuffer.cs:
##########
@@ -208,7 +208,7 @@ public static int SizeOf<T>()
/// Checks if the Type provided is supported as scalar value
/// </summary>
/// <typeparam name="T">The Type to check</typeparam>
- /// <returns>True if the type is a scalar type that is supported,
falsed otherwise</returns>
+ /// <returns>True if the type is a scalar type that is supported,
false otherwise</returns>
Review Comment:
I've sent a PR to the flatbuffers repository -- so, these changes could be
dropped and pulled in after it merges.
##########
go/arrow/array/binary_test.go:
##########
@@ -705,7 +705,7 @@ func TestBinaryViewStringRoundTrip(t *testing.T) {
mem := memory.NewCheckedAllocator(memory.DefaultAllocator)
defer mem.AssertSize(t, 0)
- values := []string{"a", "bc", "", "", "supercalifragilistic", "",
"expeallodocious"}
+ values := []string{"a", "bc", "", "", "supercalifragilistic", "",
"expialidocious"}
Review Comment:
I don't think it cares too much, but this is the proper spelling...
##########
dev/archery/archery/release/core.py:
##########
@@ -468,7 +468,7 @@ def curate(self, minimal=False):
parquet.append((self.jira_instance.issue(c.issue), c))
else:
warnings.warn(
- f'Issue {c.issue} is not MINOR nor pertains to GH' +
+ f'Issue {c.issue} does not pertain to GH' +
Review Comment:
the logic above didn't really support the message here...
##########
python/pyarrow/tests/test_fs.py:
##########
@@ -1303,12 +1303,12 @@ def test_s3_proxy_options(monkeypatch, pickle_module):
# Missing port
with pytest.raises(KeyError):
S3FileSystem(proxy_options={'scheme': 'http', 'host': 'localhost'})
- # Invalid proxy URI (invalid scheme htttps)
+ # Invalid proxy URI (invalid scheme httpsB)
with pytest.raises(pa.ArrowInvalid):
- S3FileSystem(proxy_options='htttps://localhost:9000')
Review Comment:
Swaps one misspelled item for something that is invalid but not misspelled
##########
go/arrow/datatype_extension.go:
##########
@@ -117,7 +117,7 @@ type ExtensionType interface {
// concurrently.
Serialize() string
// Deserialize is called when reading in extension arrays and types via
the IPC format
- // in order to construct an instance of the appropriate extension type.
The data passed in
+ // in order to construct an instance of the appropriate extension type.
The passed in data
Review Comment:
this is a grammatical change that I believe improves the text
##########
java/dataset/src/main/cpp/jni_util.cc:
##########
@@ -192,9 +192,9 @@ std::string Describe(JNIEnv* env, jthrowable t) {
}
bool IsErrorInstanceOf(JNIEnv* env, jthrowable t, std::string class_name) {
- jclass jclass = env->FindClass(class_name.c_str());
- DCHECK_NE(jclass, nullptr) << "Could not find Java class " << class_name;
- return env->IsInstanceOf(t, jclass);
+ jclass java_class = env->FindClass(class_name.c_str());
+ DCHECK_NE(java_class, nullptr) << "Could not find Java class " << class_name;
+ return env->IsInstanceOf(t, java_class);
Review Comment:
c++ doesn't appear to care, but this makes reading the code easier
##########
go/parquet/pqarrow/encode_arrow.go:
##########
@@ -141,7 +141,7 @@ func newArrowColumnWriter(data *arrow.Chunked, offset, size
int64, manifest *Sch
return arrowColumnWriter{}, nil
}
if leafCount != bldr.leafCount() {
- return arrowColumnWriter{}, fmt.Errorf("data
type leaf_count != builder leafcount: %d - %d", leafCount, bldr.leafCount())
+ return arrowColumnWriter{}, fmt.Errorf("data
type leaf_count != builder leaf_count: %d - %d", leafCount, bldr.leafCount())
Review Comment:
slightly atypical change
##########
java/flight/flight-sql-jdbc-core/src/test/java/org/apache/arrow/driver/jdbc/ArrowFlightJdbcDriverTest.java:
##########
@@ -142,15 +142,15 @@ public void testConnectWithInsensitiveCasePropertyKeys()
throws Exception {
driver.connect("jdbc:arrow-flight://" +
dataSource.getConfig().getHost() + ":" +
dataSource.getConfig().getPort() + "?" +
- "UseEncryptiOn=false",
+ "UseEncryptIon=false",
Review Comment:
The code is just trying to ensure that it's case insensitive -- this
variation honors that by using two dictionary words.
##########
python/CMakeLists.txt:
##########
@@ -1,5 +1,5 @@
# Licensed to the Apache Software Foundation (ASF) under one
-# or more cod ntributor license agreements. See the NOTICE file
+# or more contributor license agreements. See the NOTICE file
Review Comment:
Lawyers tend to hate changes to license files...
##########
python/pyarrow/dataset.py:
##########
@@ -534,7 +534,7 @@ def parquet_dataset(metadata_path, schema=None,
filesystem=None, format=None,
partitioning : Partitioning, PartitioningFactory, str, list of str
The partitioning scheme specified with the ``partitioning()``
function. A flavor string can be used as shortcut, and with a list of
- field names a DirectionaryPartitioning will be inferred.
+ field names a DirectoryPartitioning will be inferred.
Review Comment:
notable correction
##########
python/pyarrow/table.pxi:
##########
@@ -449,7 +449,7 @@ cdef class ChunkedArray(_PandasConvertible):
>>> import pyarrow as pa
>>> n_legs = pa.chunked_array([[2, 2, 4], [4, 5, 100]])
>>> animals = pa.chunked_array((
- ... ["Flamingo", "Parot", "Dog"],
+ ... ["Flamingo", "Parrot", "Dog"],
Review Comment:
I'm assuing this code doesn't care about spelling...
##########
ruby/red-arrow/lib/arrow/sort-key.rb:
##########
@@ -79,9 +79,9 @@ def try_convert(value)
# target and corresponding order is used. `"+"` uses ascending
# order and `"-"` uses ascending order.
#
- # If `target` is not a String nor `target` doesn't start with the
- # leading order mark, sort column target is `target` as-is and
- # ascending order is used.
+ # If `target` is either not a String or `target` doesn't start
+ # with the leading order mark, sort column is `target` as-is
+ # and ascending order is used.
Review Comment:
notable
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]