Over the weekend I realized just what strange and wonderful things tuples are. This message is as much to get my thoughts recorded as it is a request for discussion about the various classes and bits of code presented within. Some of this stuff may be useful for Boost and some may not. I'd like to hear what you all have to say. It may be that I'm simply restating the obvious to some, but I found the whole thing rather intriguing. I'll start from the beginning.
Last week I quickly wrote a prototype symbol manager for the students in a compiler class I am helping to teach. The implementation is a typical stack of symbol table objects so that names can be resolved at the proper scope. Naturally, I wanted to provide an iterator interface so they could examine the contents of the symbol manager. Because each symbol table is an autonomous entity living on the scope stack, there is no way to iterate "across" scopes. I've run into this sort of problem several times and have long wished for an iterator that can cross container boundaries. In the interest of time, I hacked together an interface to allow the students to iterate at a particular scope, but I'm not very satisfied with that solution. Over the weekend I had a bit of free time so I wrote up a spanning_iterator based on the iterator_adaptors framework. To throw in a bit more complication, I wanted something generic enough that I could span different kinds of container. There should be a way to span a std::vector and a std::list as long as their value_types are the same. The spanning_iterator holds two tuples -- one for the "current" iterators and the other for the end iterators for each container the spanning_iterator spans. The idea is that when the current "current" iterator reaches its end (the corresponding element in the "ends" tuple) the next iterator in the "current" tuple becomes current and so on until all "current" iterators reach their ends. Actually implementing the iterator_adaptor policies for this is conceptually straightforward: simply walk the "current" tuple until an iterator is found that is not at its end. Then simply operate on that iterator (dereference it, increment it, etc.). Getting this to work correctly with the tuple interface is non-trivial, however. The fundamental problem is that it's inconvenient to iterate through a tuple. All we have is the get<> template to access tuple elements. Iterating is again conceptually simple -- just increment an index. But the fact that get<> is a template implies the index must be a compile-time constant, meaning we need a compile-time loop (or, if you prefer, programmer-controlled loop unrolling) to iterate. As I went through design after design, I realized why this was so difficult to wrap my brain around: the tuple is neither a strictly compile-time nor run-time entity. On one end of the spectrum, we have MPL sequences. These are clearly compile-time only constructs. They hold type information that is completely available at compile-time. On the other end we have something like std::vector which provides only a run-time interface. Tuples sit somewhere in-between. They hold run-time values, but they are indexed at compile-time. This means that the sorts of things we want to do with tuples and how we want to iterate over them slide along a spectrum -- they change depending on how we want to view the tuple. For example, in the spanning_iterator, I want to apply an operation to a member of the tuple, but I don't know _which_ member until run-time. Because the spanning_iterator can hold different types of iterator, I don't even know at compile-time what kind of iterator is going to be affected. This has two consequences: we must use run-time checks to find the current element to operate upon and the operation itself must be passed to the looping construct because the loop cannot return a tuple element (how would we code the return type?). A static loop does us no good because the execution condition (whether an iterator is at its end) returns a run-time value. Furthermore, we need a loop that can "early out" once it finds the correct iterator. We don't want to go operating upon all of the iterators after the current one as well. Here's my solution: do_if_first. The idea is to have a loop with a run-time condition that is checked for executing the body and terminating and a compile-time condition for checking whether to iterate. We need static iteration because tuples are indexed statically. We need run-time execution because tuple values change at run-time. The if_first part tells us that the loop will terminate once it finds a tuple element that satifies the execution condition. namespace detail { template<typename Body, typename Result = typename Body::result_type> struct stop { // Called if no element satifies the execution condition static Result execute(typename Body::argument_type a) { return(Body::default_result(a)); }; // ... More execute members for different arity }; // ... More stop for void return type template<typename Body, typename Result = typename Body::result_type> struct do_if_first { static Result execute(typename Body::argument_type a) { // Execute the body if (Body::execute_condition(a)) { // Early out return(Body::execute(a)); } // Iterate return(boost::mpl::if_<typename Body::iterate_condition, do_if_first<typename Body::next>, // Note: call with current Body to // prevent invalid tuple indexing stop<Body> >::type::execute(a)); }; // ... More execute members for different arity }; // ... More do_if_first for void return type }; template<typename Body, typename Result = typename Body::result_type> struct do_if_first { static Result execute(typename Body::argument_type a) { return(detail::do_if_first<Body, Result>::execute(a)); }; // ... More execute members for different arity }; // ... More do_if_first for void return type Here's how do_if_first is used by the spanning_iterator dereference policy: template<typename IteratorAdaptor, typename Argument, typename Result, int i> struct body_base { typedef IteratorAdaptor iterator_adaptor; // Assume base_type has methods for getting iterators out of the // "current" and "end" tuples. typedef typename iterator_adaptor::base_type base_type; typedef typename base_type::iterator_tuple iterator_tuple; typedef Argument argument_type; typedef Result result_type; // loop variables enum { first = 0, last = boost::tuples::length<iterator_tuple>::value, index = i, }; // Don't want to instantiate with invalid tuple index. We only // iterate if the next index will be a valid one typedef boost::mpl::less<boost::mpl::integral_c<int, index>, boost::mpl::integral_c<int, last - 1> > iterate_condition; static bool execute_condition(const base_type &a) { // Is the current iterator not at the end? return((a.template get_iterator<index>() != a.template get_end<index>())); }; }; struct spanning_iterator_policy { // ... template<typename IteratorAdaptor, int i> struct dereference_body : public body_base<IteratorAdaptor, const typename IteratorAdaptor::base_type &, typename IteratorAdaptor::reference, i> { typedef body_base<IteratorAdaptor, const typename IteratorAdaptor::base_type &, typename IteratorAdaptor::reference, i> base_type; typedef dereference_body<typename base_type::iterator_adaptor, index + 1> next; static typename base_type::result_type execute(typename base_type::argument_type a) { // Remember, only called if get_iterator<index>() is not at end return(*a.template get_iterator<index>()); }; // Gets called if all iterators at end. Gets called with the // iteration state corresponding to the last "current" iterator. static typename base_type::result_type default_result(typename base_type::argument_type a) { assert(!"No valid dereference candidate"); // Satisfy the compiler // Would probably (hopefully!) crash return(*a.template get_iterator<index>()); }; }; template <class IteratorAdaptor> typename IteratorAdaptor::reference dereference(const IteratorAdaptor& x) const { // Start iterating (unrolling) at tuple index 0 return(do_if_first<dereference_body<IteratorAdaptor, 0> >:: execute(x.base())); }; }; Similar constructs are used for increment and decrement. I haven't yet made spanning_iterator smart enough to present itself as the "weakest" category of iterator contained in its tuples, but that should be a relatively easy matter. After thinking about this some more, I realized that several different looping constructs could be useful for tuples depending on whether they are viewed "more statically" or "more dynamically." Here's a (likely partial) list: do_all : operate on all tuple elements do_if_all : operate on all tuple elements satisfying some condition do_if_first : operate on first element satisfying some condition ("early out") do_until : operate as long as some condition doesn't hold ("early out") do_after : operate only after seeing that an element satisfies some condition ("late in") We could imagine more complicated loops such as do_if_until, etc. Then do_until simply becomes do_if_until with an always-true execute condition. We can then imagine coding similar loops that are more static (executing on elements with even indicies, for example, making the execute_condition static as well). All these loops are likely useful outside of the tuple domain but I haven't put any thought into that. Is any of this stuff useful for Boost? I think spanning_iterator is a useful thing and a (relatively) conveneient way to iterate over a tuple is nice to have. What are your thoughts? -Dave -- "Some little people have music in them, but Fats, he was all music, and you know how big he was." -- James P. Johnson _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost