Well after testing and letting my mind recover from being blown so hard, it 
appears you're correct. I never realised that your first call to Enum 
actually operates like that.

Thanks for the advice! I tested this with a 4GB file and it appears to be 
okay. I think I fell into the trap of Enum == List and got caught on 
thinking that my input had to be evaluated before the reduce call would 
happen. Did some src digging and I'm kinda with the flow of things now - 
but I do think that this might be a little under-documented and perhaps 
need some specific examples.

Thanks again :)

On Wednesday, October 26, 2016 at 6:06:48 AM UTC+1, Peter Hamilton wrote:
>
> Doesn't Enum.reduce handle this perfectly fine?
>
> The Enum module doesn't require the entire stream to be evaluated before 
> it starts performing the operation. It requires the entire stream before it 
> emits any values. Since you only have one value you want to emit, and you 
> don't want to emit it until you are done, Enum.reduce does exactly what you 
> want.
>
> On Tue, Oct 25, 2016 at 8:33 PM Isaac Whitfield <[email protected] 
> <javascript:>> wrote:
>
>> First off, I'm sorry if this has been discussed before - I've searched 
>> several places and haven't come across anything so that's why I'm here :) 
>> I'm sure there's a reason why this does not exist, but on the off chance it 
>> was simply never considered here's the formal request!
>>
>> I believe there is currently a gap inside the Stream module with regards 
>> to reductions. As it stands it appears the only option available to someone 
>> who does not wish to buffer an entire File (for example) whilst running a 
>> Stream is Stream.run/1. As the documentation explains, this lets you run a 
>> Stream which may have side effects and ignores the value at the end. I 
>> believe there should be something similar which returns state at the end, 
>> namely Stream.reduce/3.
>>
>> In the past, I have argued against this myself and said that there's no 
>> "good" use case but I've hit one I feel totally justifies the addition 
>> (assuming there's nothing I'm missing). Below is an example use case - it's 
>> a dumb example, but the concept is sound.
>>
>> Consider you're streaming a file containing line separated numbers. Your 
>> task is to count the occurrences of even numbers, but the file is 50GB in 
>> size and you only have 4GB of RAM on your machine (you get the idea). So 
>> you can't just buffer the whole thing, because your app would crash. 
>> Instead you have to resort to a Stream, but you only ever need to keep 
>> track of a single variable; the number of even numbers.
>>
>> If we look at the Stream library, it might appear that Stream.scan/3 is 
>> perfect for this as it passes the result of the last function through to 
>> the next execution - fantastic.
>>
>> So we devise our call:
>>
>> require Integer
>>
>> "my_file.txt"
>> |> File.stream!
>> |> Stream.scan(0, fn(line, count) ->
>>      num = convert_to_number(line)
>>
>>      if Integer.is_even(num) do
>>        count + 1
>>      else
>>        count
>>      end
>>    end)
>>
>> This looks great because it means the last return value of our Stream is 
>> going to be the number of even numbers. But we then hit a wall, because 
>> Stream.run/1 doesn't return anything except :ok - and the only other way to 
>> get the last return value of the Stream is to do Enum.to_list/1 and walk 
>> the entire list until the end and then pull it out, which defeats the point 
>> of what we're trying to do.
>>
>> It's because of this that I feel we need a Stream.reduce/3 which has the 
>> only intent of being able to run a Stream whilst returning a value - 
>> meaning that the last value returned by the callback would be the return 
>> value of the Stream.reduce/3 call.
>>
>> Then you could do:
>>
>> require Integer
>>
>> "my_file.txt"
>> |> File.stream!
>> |> Stream.reduce(0, fn(line, count) ->
>>      num = convert_to_number(line)
>>
>>      if Integer.is_even(num) do
>>        {:cont, count + 1}
>>      else
>>        {:cont, count}
>>      end
>>    end)
>> |> IO.inspect # prints a number of even numbers, without ever buffering 
>> the list
>>
>> This has the powerful ability to return a single value after processing a 
>> Stream without ever holding the entire thing in memory, which is awesome 
>> (imagine all of the neat use cases for analytics apps).
>>
>> I don't know much about how this would be implemented, but it appears to 
>> be a small modification of scan to simply keep the last returned value 
>> rather than add it to a list. That's partly why I feel it should be added - 
>> it's likely that Stream.scan/3 could re-use Stream.reduce/3, meaning 
>> there's hopefully only a small amount of code bloat.
>>
>> Is there a reason this does not/could not exist? The closest thing I've 
>> found to implementing this appears to be with Stream.transform/4 but I have 
>> been unsuccessful and even if I got it to work it feels hacky.
>>
>> Thank you in advance for your replies!
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elixir-lang-core" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elixir-lang-core/4b6a48d2-4855-4877-b00c-d72a26748aac%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elixir-lang-core/4b6a48d2-4855-4877-b00c-d72a26748aac%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/c81bacc4-6ff7-4a11-9662-edc882c543ac%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to