Hi Niko,

Thank you for your response. I forgot to include it in the original message, but here is a link to the Bitbucket repository with all the code I have so far: https://bitbucket.org/googolplex/algo/src (see the `io` module). Maybe it will be helpful.
I'm still inclined to think that this is a bug, and I really do not see how I can do what I want with two lifetime parameters. There really should be only one lifetime (that of the Reader), or maybe I'm missing something?

As for internal iterators, yes, I tried that first, but since it is impossible to break out of or return from the iterator loop directly via `break` or `return`, it quickly becomes impractical: I have to use a lot of boilerplate boolean parameters and check them in many places. This may not be so apparent for finite structures, but for potentially infinite ones (like a wrapper for `Reader`) it is. Nonetheless, as far as I understand, external iterators are the future of iteration in Rust (generators won't appear in the near future, will they?), so I want to use the most idiomatic style.

2013/8/20 Niko Matsakis <[email protected]>:
> Hi,
>
> Sorry for not responding more quickly. I've been wanting to sit down
> and work out your example; I am confident that it can be made to work,
> although from reading it quickly it sounds like a case that might be
> better served with two lifetime parameters, which are not yet
> supported (on my list...).
>
> However, I did want to briefly point out that you can continue to use
> "internal" iterators, you just don't get the `for` syntax
> anymore. Just write a higher-order function as you always did,
> possibly returning bool to indicate whether to break or continue.
>
> Niko
>
> On Sat, Aug 17, 2013 at 01:54:09PM +0400, Vladimir Matveev wrote:
>> Hello,
>>
>> I'm writing a simple tokenizer which is defined by this trait:
>>
>>     trait Tokenizer {
>>         fn next_token(&mut self) -> ~str;
>>         fn eof(&self) -> bool;
>>     }
>>
>> An obvious application for a tokenizer is splitting a stream coming
>> from a Reader, so I have the following structure which should
>> implement Tokenizer:
>>
>>     pub struct ReaderTokenizer<'self> {
>>         priv inner: &'self Reader,
>>         priv buffer: ~CyclicBuffer,
>>         priv seps: ~[~str]
>>     }
>>
>> I have used the 'self lifetime parameter since I want the tokenizer
>> to work for any Reader. CyclicBuffer is another structure which
>> essentially is an array of u8 with special read/write operations.
>>
>> Implementation of a Tokenizer for ReaderTokenizer involves reading
>> from the Reader one byte at a time. I decided to use buffering to
>> improve performance. But I still want to keep the useful abstraction
>> of single byte reading, so I decided to implement Iterator<u8> for my
>> Reader+CyclicBuffer pair. BTW, internal iterators in 0.7 were much
>> better for this, because internal iterator code was very simple and
>> didn't use explicit lifetimes at all, but the 0.7 compiler suffers
>> from several errors related to pointers to traits which prevented my
>> program from compiling (I couldn't pass a reference to Reader to a
>> CyclicBuffer method; there were other errors I've encountered too).
>> So, I decided to use the trunk version of the compiler, in which
>> these errors are resolved according to github, but the trunk version
>> does not allow internal iterators, which is very sad since now I'm
>> forced to create intermediate structures to achieve the same thing.
>>
>> So, I came up with the following iterator structure:
>>
>>     struct RTBytesIterator<'self> {
>>         tokenizer: &'self mut ReaderTokenizer<'self>
>>     }
>>
>>     impl<'self> Iterator<u8> for RTBytesIterator<'self> {
>>         fn next(&mut self) -> Option<u8> {
>>             if self.tokenizer.eof() {
>>                 return None;
>>             }
>>             if self.tokenizer.buffer.readable_bytes() > 0 ||
>>                self.tokenizer.buffer.fill_from_reader(self.tokenizer.inner) > 0 {
>>                 return Some(self.tokenizer.buffer.read_unsafe());
>>             } else {
>>                 return None;
>>             }
>>         }
>>     }
>>
>> Note that tokenizer field is &'self mut since CyclicBuffer is mutable.
>> buffer.fill_from_reader() function reads as much as possible from the
>> reader (returning a number of bytes read), and buffer.read_unsafe()
>> returns next byte from the cyclic buffer.
>>
>> Then I've added the following method to ReaderTokenizer:
>>
>>     impl<'self> ReaderTokenizer<'self> {
>>         ...
>>         fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
>>             RTBytesIterator { tokenizer: self }
>>         }
>>         ...
>>     }
>>
>> This does not compile with the following error:
>>
>> io/convert_io.rs:98:37: 98:43 error: cannot infer an appropriate lifetime due to conflicting requirements
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>>                                                          ^~~~~~
>> io/convert_io.rs:97:55: 99:5 note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the block at 97:55...
>> io/convert_io.rs:97     fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>> io/convert_io.rs:99     }
>> io/convert_io.rs:98:37: 98:43 note: ...due to the following expression
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>>                                                          ^~~~~~
>> io/convert_io.rs:97:55: 99:5 note: but, the lifetime must be valid for the lifetime &'self as defined on the block at 97:55...
>> io/convert_io.rs:97     fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>> io/convert_io.rs:99     }
>> io/convert_io.rs:98:8: 98:23 note: ...due to the following expression
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>>                             ^~~~~~~~~~~~~~~
>> error: aborting due to previous error
>>
>> OK, fair enough, I guess I have to annotate self parameter with 'self
>> lifetime:
>>
>>     fn bytes_iter(&'self mut self) -> RTBytesIterator<'self> {
>>         RTBytesIterator { tokenizer: self }
>>     }
>>
>> This compiles, but now I'm getting another error at bytes_iter() usage
>> site, for example, the following code:
>>
>>     fn try_read_sep(&mut self, first: u8) -> (~[u8], bool) {
>>         let mut part = ~[first];
>>         for b in self.bytes_iter() {
>>             part.push(b);
>>             if !self.is_sep_prefix(part) {
>>                 return (part, false);
>>             }
>>             if self.is_sep(part) {
>>                 break;
>>             }
>>         }
>>         return (part, true);
>>     }
>>
>> fails to compile with this error:
>>
>> io/convert_io.rs:117:17: 117:36 error: cannot infer an appropriate lifetime due to conflicting requirements
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~~~~~~~~~~~~~~~
>> io/convert_io.rs:117:17: 117:22 note: first, the lifetime cannot outlive the expression at 117:17...
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~
>> io/convert_io.rs:117:17: 117:22 note: ...due to the following expression
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~
>> io/convert_io.rs:117:17: 117:36 note: but, the lifetime must be valid for the method call at 117:17...
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~~~~~~~~~~~~~~~
>> io/convert_io.rs:117:17: 117:22 note: ...due to the following expression
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~
>>
>> And now I'm completely stuck. I can't avoid these errors at all. This
>> looks like a bug to me, but I'm not completely sure - maybe it's me
>> who is wrong here.
>>
>> I've studied libstd/libextra code for clues and found out that some
>> iterable structures have code very similar to mine, for example,
>> RingBuf. Here is its mut_iter() method:
>>
>>     pub fn mut_iter<'a>(&'a mut self) -> RingBufMutIterator<'a, T> {
>>         RingBufMutIterator{index: 0, rindex: self.nelts, lo: self.lo,
>>                            elts: self.elts}
>>     }
>>
>> I have tried to implement bytes_iter() method like this, but it
>> naturally didn't work because of 'a and 'self lifetimes conflict. In
>> my understanding, this works here because RingBuf does not have
>> lifetime parameter, so no conflict between 'self and 'a lifetime is
>> possible at all. But this will not work in my case, because I have to
>> have 'self parameter because of &'self Reader field.
>>
>> What can I do to implement my ReaderTokenizer? Maybe there are other
>> ways of which I'm unaware?
>>
>> Thank you very much in advance.
>>
>> Best regards,
>> Vladimir.
>> _______________________________________________
>> Rust-dev mailing list
>> [email protected]
>> https://mail.mozilla.org/listinfo/rust-dev
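
For reference, here is a minimal sketch of the two-lifetime arrangement mentioned in this thread. It is written in present-day Rust syntax, not the 2013 syntax quoted above, so it illustrates the idea and would not have compiled with the compiler under discussion. Several things in it are assumptions made for the example: a plain `Vec<u8>` with a `pos` index stands in for `CyclicBuffer`, `std::io::Read` stands in for `Reader`, and `split_at_comma` is a hypothetical helper. The point it tries to show is that the iterator can carry one lifetime (`'t`) for the short borrow of the tokenizer and a separate one (`'r`) for the reader, so the borrow ends when the loop does.

```rust
use std::io::{self, Read};

// 'r is the lifetime of the borrowed reader (assumed stand-in for `Reader`).
struct ReaderTokenizer<'r> {
    inner: &'r mut dyn Read,
    buffer: Vec<u8>, // assumed stand-in for CyclicBuffer
    pos: usize,      // index of the next unread byte in `buffer`
}

// Two lifetimes: 't for the borrow of the tokenizer, 'r for the reader.
struct RTBytesIterator<'t, 'r> {
    tokenizer: &'t mut ReaderTokenizer<'r>,
}

impl<'r> ReaderTokenizer<'r> {
    fn new(inner: &'r mut dyn Read) -> Self {
        ReaderTokenizer { inner, buffer: Vec::new(), pos: 0 }
    }

    // Replaces the buffer contents with the next chunk from the reader
    // and returns the number of bytes read (0 at end of input).
    fn fill_from_reader(&mut self) -> io::Result<usize> {
        let mut chunk = [0u8; 4];
        let n = self.inner.read(&mut chunk)?;
        self.buffer.clear();
        self.buffer.extend_from_slice(&chunk[..n]);
        self.pos = 0;
        Ok(n)
    }

    // The returned iterator borrows `self` only for 't, not for all of 'r.
    fn bytes_iter<'t>(&'t mut self) -> RTBytesIterator<'t, 'r> {
        RTBytesIterator { tokenizer: self }
    }
}

impl<'t, 'r> Iterator for RTBytesIterator<'t, 'r> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        let t = &mut *self.tokenizer;
        if t.pos == t.buffer.len() {
            // Buffer exhausted: refill, and stop on end of input or error.
            match t.fill_from_reader() {
                Ok(n) if n > 0 => {}
                _ => return None,
            }
        }
        let b = t.buffer[t.pos];
        t.pos += 1;
        Some(b)
    }
}

// Hypothetical helper: bytes before the first comma, then everything after it.
fn split_at_comma(mut input: &[u8]) -> (Vec<u8>, Vec<u8>) {
    let mut tok = ReaderTokenizer::new(&mut input);
    let mut first = Vec::new();
    for b in tok.bytes_iter() {
        if b == b',' {
            break; // leaving the loop ends the 't borrow of `tok`
        }
        first.push(b);
    }
    // A second iterator can be created because the first borrow has ended.
    let rest: Vec<u8> = tok.bytes_iter().collect();
    (first, rest)
}
```

Calling `split_at_comma(b"ab,cd")` exercises both a `break` out of the `for` loop and a second call to `bytes_iter()` on the same tokenizer, which is the usage pattern that `try_read_sep` needs.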
