Hi Jack!

The "tdelete-duplicates" uses a hash-table to store already seen elements: if 
the element is in the hash table, just filter it out.  If it is not, we do: 
(hash-set! already-seen element #t). That should be constant timeish. This 
means that you cannot (and should not) use the same reducing function as 
returned by ((tdelete-duplicates) rcons) for reducing different collections 
(unless you want to delete the duplicates in the first collection in the second 
collection as well). the transducer->reducer process is o(n), and the amount of 
transducers composed is usually small, so there is no need to re-use a reducer 
function unless you actually want to.

I would love some help into making it a racket pkg if you have the time. I have 
absolutely zero experience. I can probbly port the test suite myself, but I 
have quite a lot of similar work to do (srfi-document->texinfo for guile), so 
that will be quite some time in the future.
-- 
  Linus Björnstam

On Tue, 10 Dec 2019, at 12:02, Jack Firth wrote:
> This is great! Thank you for your work on this srfi. Transducers are 
> relatively new to Racket's ecosystem and I'm happy to see more of the 
> design space being explored.
> 
> > The documentation in the SRFI document still holds with some small caveats: 
> > bytevector-u8-reduce/transduce is now bytes-transduce/reduce and 
> > tdelete-duplicates now takes three symbols 'eq?, 'eqv? or 'equal? instead 
> > of arbitrary equality predicates.
> 
> I have a question about that. While implementing the deduplicating 
> <https://docs.racket-lang.org/rebellion/Transducers.html#%28def._%28%28lib._rebellion%2Fstreaming%2Ftransducer..rkt%29._deduplicating%29%29>
>  transducer, I tried to figure out a way to make it support arbitrary 
> equality relations. But as far as I could tell 
> <https://github.com/jackfirth/rebellion/issues/310#issuecomment-546567800>, 
> it's impossible to do that in less than quadratic time because there's no way 
> to check whether something is a duplicate without comparing it to every 
> unique element seen so far. Is there an implementation of this SRFI that 
> figured out an efficient way to do this in the general case?
> 
> > I have really no idea how to package things for racket, nor do I have much 
> > interest in doing so. I have played with it a bit, it seems to work. There 
> > is a small test-suite still written in guile scheme, which should be 
> > trivial to port. If someone wants to package this as a proper racket 
> > package, I would be happy to accept pull requests or even transfer the 
> > repository to someone else. The license used is sub-licenseable, so feel 
> > free. If I find the repo to my liking I will deprecate my port and link to 
> > you.
> > 
> > Ideas for changes I would have done if I was using racket more than I 
> actually do: a (transduce ...)-form that uses rackets sequence 
> interface. Package it so It could be installed using raco. Port the 
> test-suite. Another part is figuring out what to call the library. 
> I'd be happy to write an `info.rkt` that packages the code. For a 
> package name, I suggest `srfi-171`. Would you be interested in writing 
> Scribble documentation for the package?
> 
> > The rebellion transducers seem to be aimed at another use-case (streaming 
> > data, and parallelism) with other kind of guarrantees. It has a 
> > _significantly_ higher overhead. I tried the following in rebellion 
> > 
> > #lang racket 
> > (require rebellion/streaming/transducer) 
> > (require rebellion/collection/list) 
> > 
> > (define lst (build-list 100000 values)) 
> > 
> > (time (length (transduce (in-list lst) 
> >  (filtering odd?) 
> >  (mapping (lambda (x) (* x x))) 
> >  #:into into-list))) 
> > 
> > and it takes about 5 seconds. The same thing using srfi-171 styled 
> > transducers takes about 3ms (if my head is with me this morning, that is 
> > about 1650x). I suspect I might be doing something wrong, or that there 
> > might be some quadratic behaviour somewhere.
> 
> You're not doing anything wrong. Rebellion's transducers have awful 
> constant (but not quadratic) factors and make lots of unnecessary 
> allocations. There's no fundamental reason for this; the protocol can 
> support efficient zero-allocation transducer implementations perfectly 
> fine. I just haven't put work into that yet. See this issue 
> <https://github.com/jackfirth/rebellion/issues/351> for a summary of 
> the problem and some possible ways to improve performance. For anyone 
> reading this who's interested in benchmarking and optimizing tight 
> loops, I'd love to hear your input.
> 
>  -- 
>  You received this message because you are subscribed to the Google 
> Groups "Racket Users" group.
>  To unsubscribe from this group and stop receiving emails from it, send 
> an email to racket-users+unsubscr...@googlegroups.com.
>  To view this discussion on the web visit 
> https://groups.google.com/d/msgid/racket-users/8c036a47-6d90-4bcd-bfd3-fdfdd2f8d05b%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/racket-users/8c036a47-6d90-4bcd-bfd3-fdfdd2f8d05b%40googlegroups.com?utm_medium=email&utm_source=footer>.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/9bd80742-344d-4d18-91d6-6ebaa28576b1%40www.fastmail.com.

Reply via email to