[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988129#comment-16988129 ] Antoine Pitrou commented on ARROW-5508: --- Is there something left to do here? It seems the {{Iterator}} interface and its {{Visit}} method are sufficient. > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 1.0.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924598#comment-16924598 ] Wes McKinney commented on ARROW-5508: - Moved out of 0.15.0 as it seems like there is more thinking to do on this > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 1.0.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917771#comment-16917771 ] Benjamin Kietzman commented on ARROW-5508: -- I think the existing interface is sufficient for our current use cases. WRT easier iteration, we currently have {{Iterator::Visit}}: {code} ASSERT_OK(iter.Visit([&](T v) { doSomethingWith(v); return Status::OK(); })); {code} > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.15.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917538#comment-16917538 ] Micah Kornfield commented on ARROW-5508: My only comment on this is might pay to sketch out what iteration looks like in each case. Using StopIteration and Result I think you get something like: while (RETURN_IF_NOT_STOPPED_ERROR(v, iter.next()) { doSomethingWith(v) } Not sure if the macros to make that nice are even vaible. "As for signaling completion without consuming a value, well, the problem is that not all iterators may support that. Is there a use case where this matters?" I think in many cases this can be solved with a "peek" style iterator? > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.15.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916031#comment-16916031 ] Antoine Pitrou commented on ARROW-5508: --- We could also take a page from Python and have a special {{Status::StopIteration}} to signal completion. I also think it's essential to be able to return errors - either via {{Status}} or {{Result}}. As for signaling completion without consuming a value, well, the problem is that not all iterators may support that. Is there a use case where this matters? > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.15.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916021#comment-16916021 ] Francois Saint-Jacques commented on ARROW-5508: --- ping for input: [~pitrou] [~bkietz] [~emkornfi...@gmail.com] > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.15.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913838#comment-16913838 ] Francois Saint-Jacques commented on ARROW-5508: --- My take after implementing MapIterator, FlattenIterator. # `T` must be of pointer type (or support assignment/comparison of nullptr). Iterator completion is signaled by assigning `T* out` to nullptr. # Due to previous point, the iterator may never yield nullptr as a valid value. # The interface forces consuming a value to know if it's empty, i.e. there's no Done()/HasNext(). This can lead to [odd|https://github.com/fsaintjacques/arrow/commit/36ba801f47a1053c292fd461afd4ec23e63c1e97#diff-df9646433131d9cf9f31a395c2719b70R157-R190] consumption. # I question the user of Status as a return code, maybe we should have a specialized `FailableIterator : Iterator>` for the same effect. The first and second point could be tackled by returning `Option` (Result wouldn't work because we can't use Status::OK() as a sentinel-completion value). The third is annoying for streaming iterators (when there's no way to know completion without side effect), since the iterator itself must consume on Done() call and cache the result. I think I prefer putting the onus on the iterator implementor than the user of the interface. > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.15.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913772#comment-16913772 ] Wes McKinney commented on ARROW-5508: - [~pitrou] [~bkietz] [~fsaintjacques] is the {{Iterator}} in the codebase now satisfactory? > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.15.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867739#comment-16867739 ] Benjamin Kietzman commented on ARROW-5508: -- 3. A separate out parameter of type {{bool*}} {code} template class Iterator { public: virtual ~Iterator() = default; virtual Status Next(T* value, bool* done) = 0; } {code} > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867492#comment-16867492 ] Liya Fan commented on ARROW-5508: - I still have not figured out how to check the end of the iterator by the Next method: # If it is through the return value, this could be inefficient - Status is a class, and there can be object copy when returning an Status object. # If it is through the method parameter, I think the parameter type should be T*& or T** ? > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864368#comment-16864368 ] Wes McKinney commented on ARROW-5508: - Let's wait until {{StatusOr}} lands and evaluate options. It might be nice to have {{Next}} return {{StatusOr}}. Ultimately this is about having an abstract base type like {{std::unique_ptr>}} to return from functions that deal in streams > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.14.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864152#comment-16864152 ] Benjamin Kietzman commented on ARROW-5508: -- Why not just use the [{{Range}} concept|https://en.cppreference.com/w/cpp/language/range-for]? We already have [arrow::util::LazyRange|https://github.com/apache/arrow/blob/aedba2c/cpp/src/arrow/util/lazy.h] which provides the functionality you want, I think. If you prefer the {{Iterator}} interface as you have written it then we should include an adapter so that range for loops are easy > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Fix For: 0.14.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863415#comment-16863415 ] Wes McKinney commented on ARROW-5508: - I added a prototype for this here, see the comment on the Next method https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/iterator.h I notice that I missed the virtual destructor. If this is the preferred Iterator interface, then I would suggest refactoring one or two of our ad hoc iterator classes in the codebase to use {{arrow::Iterator}} as a path to resolving this issue. Thoughts? > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Fix For: 0.14.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-5508) [C++] Create reusable Iterator interface
[ https://issues.apache.org/jira/browse/ARROW-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856229#comment-16856229 ] Liya Fan commented on ARROW-5508: - [~wesmckinn], thanks for the good point. What is the standard way to know if there is a next element in the iterator? > [C++] Create reusable Iterator interface > > > Key: ARROW-5508 > URL: https://issues.apache.org/jira/browse/ARROW-5508 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Fix For: 0.14.0 > > > We have various iterator-like classes. I envision a reusable interface like > {code} > template > class Iterator { > public: > virtual ~Iterator() = default; > virtual Status Next(T* out) = 0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)