Re: [elixir-core:6588] Flow.chunk ?

Peter C. Marks Fri, 11 Nov 2016 07:22:30 -0800

Thank you, José, for your critique of my code and your suggested rewrite.
Your code works great! (I just needed to add the parameter t to the call to
find_sequences.) I will need to spend a little more time understanding why
you suggested those changes. I plan on blogging about this soon.
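
For anyone following along, here is a minimal sketch of that adjustment,
with t threaded through as a third argument. The module name ClumpSketch
and the clumps/4 helper are made up for illustration. Enum.chunk_every/4
and Stream.chunk_every/4 stand in for the older chunk functions used in
this thread, and Stream.flat_map/2 is only a sequential stand-in for
Flow.from_enumerable plus Flow.flat_map, so the snippet runs without the
Flow dependency.

```elixir
defmodule ClumpSketch do
  # Returns the k-mers that occur at least t times inside one window `e`
  # (a charlist). Passing `t` explicitly is the only change relative to
  # the version suggested in the quoted message.
  def find_sequences(e, k, t) do
    e
    # Overlapping k-mers: chunks of size k, advancing one element at a time.
    # :discard drops trailing chunks shorter than k.
    |> Enum.chunk_every(k, 1, :discard)
    # Count how often each k-mer occurs in this window.
    |> Enum.reduce(%{}, fn w, acc -> Map.update(acc, w, 1, &(&1 + 1)) end)
    # Keep only the k-mers seen at least t times.
    |> Enum.reject(fn {_, n} -> n < t end)
    |> Enum.map(fn {seq, _} -> seq end)
  end

  # Hypothetical sequential stand-in for the Flow pipeline: slide a window
  # of length l over the sequence and collect the frequent k-mers of each
  # window. The same k-mer may appear once per window that contains it.
  def clumps(sequence, k, l, t) do
    sequence
    |> String.to_charlist()
    |> Stream.chunk_every(l, 1, :discard)
    |> Stream.flat_map(&find_sequences(&1, k, t))
    |> Enum.to_list()
  end
end
```

In the Flow version, the corresponding call site would presumably read
Flow.flat_map(&find_sequences(&1, k, t)).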


Thanks again,

Peter

On Wed, Nov 9, 2016 at 6:57 PM, José Valim <[email protected]>
wrote:

> Thanks Peter!
>
> I believe you don't want to call Flow.chunk/2. Calling Enum.chunk(l, 1)
> before Flow.from_enumerable/2 is the way to go in your case, as it
> guarantees that *chunks*, and not letters, are spread around on Flow.map/2.
> If you instead called Flow.chunk/2 after Flow.from_enumerable/2, the DNA
> order would be lost by the time you got to Flow.chunk/2, and you would
> effectively chunk items in random order. At most, I would suggest using
> Stream.chunk/3 instead of Enum.chunk/3 (so you don't need to build all
> chunks upfront).
>
> On the other hand, if you are referring to the inner chunk in Flow.map/2,
> it also wouldn't yield the correct results, because you would be chunking
> groups of "e" and not a single "e" like now.
>
> Finally, it doesn't seem you need partitioning at all, since you are not
> reducing over any state (I may have misled you in a previous reply,
> sorry). My suggestion:
>
> sequence
> |> String.to_charlist
> |> Stream.chunk(l, 1)
> |> Flow.from_enumerable
> |> Flow.flat_map(&find_sequences(&1, k))
> |> Enum.to_list
>
> def find_sequences(e, k) do
>   e
>   |> Enum.chunk(k, 1)
>   |> Enum.reduce(%{}, fn w, acc ->
>        Map.update(acc, w, 1, & &1 + 1)
>      end)
>   |> Enum.reject(fn({_, n}) -> n < t end)
>   |> Enum.map(fn({seq, _}) -> seq end)
> end
>
>
> PS: I haven't tested it.
>
>
>
>
> *José Valim*
> www.plataformatec.com.br
> Skype: jv.ptec
> Founder and Director of R&D
>
> On Wed, Nov 9, 2016 at 11:18 PM, Peter C. Marks <[email protected]>
> wrote:
>
>> Yes, I do use partition.  The full flow is:
>>
>>   sequence
>>   |> String.to_charlist
>>   |> Enum.chunk(l, 1)
>>   |> Flow.from_enumerable
>>   |> Flow.partition
>>   |> Flow.map(fn e -> Enum.chunk(e, k, 1) end)
>>   |> Flow.map(
>>         fn e ->
>>           Enum.reduce(e, %{},
>>             fn w, acc ->
>>               Map.update(acc, w, 1, & &1 + 1)
>>             end)
>>         end)
>>   |> Flow.flat_map(
>>         fn e ->
>>           Enum.reject(e, fn({_, n}) -> n < t end)
>>         end)
>>   |> Flow.map(fn({seq, _}) -> seq end)
>>   |> Enum.to_list
>>
>>
>>
>> On Wed, Nov 9, 2016 at 4:43 PM, José Valim <[email protected]> wrote:
>>
>>>
>>>
>>>> sequence
>>>> |> String.to_charlist
>>>> |> Enum.chunk(l, 1)
>>>> |> Flow.from_enumerable
>>>> |> Flow.map(fn e -> Enum.chunk(e, k, 1) end)
>>>>
>>>
>>> Do you call partition at some point in your flow? Otherwise it won't
>>> exploit parallelism if you have only one source. Also, if you need to chunk
>>> before you partition, you can chunk before calling from_enumerable:
>>>
>>> sequence
>>> |> String.to_charlist
>>> |> Stream.chunk(l, 1)
>>> |> Flow.from_enumerable
>>> |> ...
>>>
>>>
>>> I think it will be easy to add chunking to Flow because we can delegate
>>> to Stream but I just want to make sure I fully understand your use case and
>>> where parallelism is being introduced.
>>>
>>> --
>>> You received this message because you are subscribed to a topic in the
>>> Google Groups "elixir-lang-core" group.
>>> To unsubscribe from this topic, visit
>>> https://groups.google.com/d/topic/elixir-lang-core/Avea6YFZLRQ/unsubscribe.
>>> To unsubscribe from this group and all its topics, send an email to
>>> [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4KpD-tf5p5sAS2nwZusd0reKdti-zL8-wzf%3DHjD8p%3D5qQ%40mail.gmail.com.
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> Peter C. Marks
>> @PeterCMarks
>>



-- 
Peter C. Marks
@PeterCMarks

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/CA%2BKdhmj8ejinTB502T5nJWqrfpkDdK%3DN0Ds0LRg78EmOgeHC-w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
